Tue Jun 13 12:40:32 PDT 2000 Kevin Karplus

T0100 Pectin Methylesterase, E. chrysanthemi

One weak match with blast: 1acx
No matches with double-blast.
T2K alignment finds many pectin methylesterases (119), but nothing else.

Searching with the target model gets weak hits:

    Template     E-value  FSSP    SCOP
    1ex1A        0.69     1ex1A   3.1.7
    1cov3        3.5      1aym3   2.9.1
    2lefA        9.3      2lefA   1.22.1
    1a0c[ABCD]   12.      1a0cA   3.1.14
    1iiv[ABC]    33.      Theoretical model

The T99 template library search yields only weak hits also:

    1cg2A        24       1cg2A   3.51.4, 4.48.17
    2izhB        27       1swuA   2.57.1
    2bltA        31       1gceA   5.3.1
    1pea         35       1pea    3.88.1

No two-way hits.

The secondary structure prediction is mainly beta, with just a couple short
helices, giving more weight to the 2.9.1 and 2.57.1 hits.

Wed Jun 14 15:15:34 PDT 2000 Kevin Karplus

Extending the thresholds with T0100.remote-t2k does not change the top
hits much:

    1ex1A        1.6      1ex1A   3.1.7
    1cov3        2.6      1aym3   2.9.1
    2lefA        8.2      2lefA   1.22.1
    1a0c[ABCD]   17.      1a0cA   3.1.14
    1cdo[AB]     25.      2ohxA   2.33.1, 3.2.1
    1a5iA        26.      1ajsA   2.44.1
    1ev13        27.      1aym3   2.9.1
    1qrbA        28.      1qa1A   ?
    1ak5         29.      1ak5    3.1.6
    1gfq         31.      2omf    6.4.3

I'm a little more encouraged by the 2.9.1 hit for 1cov3 now, though perhaps
not for any good reason.

The alignment 1cov3/T0100-1cov3-local.pw does not look good. Only one
conserved residue, one beta sheet is missing an interior strand, and the
other beta sheet has only one strand aligned.

The Viterbi alignment T0100-1cov3-vit.pw gets more conserved residues (12),
but has only 2 out of 4 strands of one sheet, and 1 out of 4 on the other.

The global alignment 1cov3/T0100-1cov3-global.pw gets 39 conserved residues,
but at the cost of many gaps (5 deletions and 16 insertions). If we drop off
the N-terminus (which is rather unstructured in 1cov3), the remaining beta
sandwich looks pretty good, with conserved residues lining up in stripes
across the sheets.

The 1aym3/T0100-1aym3-global.pw alignment is similar to the 1cov3 global
alignment, but with fewer conserved residues.
Unfortunately fssp2a2m has not been able to create a 1aym3.fssp.a2m file.
If Mark can fix that, we can try forcing an alignment to 1cov3 using that
fssp alignment to create the model.

Thu Jun 15 11:10:27 PDT 2000 Kevin Karplus

Got a new 1aym3.fssp.a2m file (Mark Diekhans edited the fssp file to remove
the alignment error.)
The 1aym3/1aym3-T0100-fssp-global.pw looks terrible.

I trimmed the 1cov3/T0100-1cov3-global.pw to the core part:
    1cov3/T0100-1cov3-hand1.a2m
There is still a bad beta strand in the middle that I see no easy way to
fix.

Thu Jun 15 15:29:31 PDT 2000 Kevin Karplus

Looked at the SAM-T99 results, which had only target-model hits:

    chain   E-value  FSSP   SCOP           name
    1aqh    4.7      1smd   2.66.1,3.1.7   Alpha-Amylase
    1aqm    4.7      1smd   2.66.1,3.1.7   Alpha-Amylase
    1b0iA   4.7      1smd   2.66.1,3.1.7   Alpha-Amylase
    4nn9    5        2qwc   2.63.1         Neuraminidase N9 (E.C. 3.2.1.18) (Sialidase)
    1nccN   5        2qwc   2.63.1         N9 Neuraminidase-Nc41 (E.C. 3.2.1.18) Mutant
    1ncdN   5.7      2qwc   2.63.1         N9 Neuraminidase-Nc41 (E.C. 3.2.1.18) Comple
    1nmaN   6.7      2qwc   2.63.1         N9 Neuraminidase
    1bn8A   7.6      1pcl   2.75.1         Pectate Lyase

2.66.1 is a "folded-sheet greek-key", 2.63.1 is a 6-bladed beta propeller,
and 2.75.1 is a right-handed beta helix. The pectate lyase is particularly
interesting, since it binds the right substrate.

With global scoring the T2k target model scores 2qwc better than 1cov3.
The alignment 2qwc/T0100-2qwc-global.pw misses 2 of the 6 blades though,
so the fold is rather unlikely.

The 1bn8A/T0100-1bn8A-vit.pw and 1bn8A/T0100-1bn8A-local.pw alignments just
pull out two turns of the beta helix. The global alignment
1bn8A/T0100-1bn8A-global.pw covers the helix, but has few conserved residues
and one of the middle strands is missing (probably the beta strands are
misaligned).
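Several chains in the SAM-T99 hit list above share an FSSP representative,
so the eight hits cover fewer distinct structures than the row count
suggests. A minimal sketch of collapsing the list by FSSP representative
follows. The function name group_by_fssp and the tuple layout are
illustrative assumptions, not part of SAM; the values are copied from the
table.

```python
# Hits copied from the SAM-T99 table: (chain, E-value, FSSP representative).
HITS = [
    ("1aqh", 4.7, "1smd"),
    ("1aqm", 4.7, "1smd"),
    ("1b0iA", 4.7, "1smd"),
    ("4nn9", 5.0, "2qwc"),
    ("1nccN", 5.0, "2qwc"),
    ("1ncdN", 5.7, "2qwc"),
    ("1nmaN", 6.7, "2qwc"),
    ("1bn8A", 7.6, "1pcl"),
]


def group_by_fssp(hits):
    """Collapse hits that share an FSSP representative.

    Returns (fssp, best E-value, chains) tuples sorted by best E-value.
    This helper is hypothetical, written only for this illustration.
    """
    groups = {}
    for chain, evalue, fssp in hits:
        best, chains = groups.get(fssp, (float("inf"), []))
        groups[fssp] = (min(best, evalue), chains + [chain])
    return sorted(
        ((fssp, best, chains) for fssp, (best, chains) in groups.items()),
        key=lambda group: group[1],
    )


if __name__ == "__main__":
    for rank, (fssp, best, chains) in enumerate(group_by_fssp(HITS), start=1):
        print(rank, fssp, best, ",".join(chains))
```

On this table the eight chains collapse into three groups, and the pectate
lyase (FSSP representative 1pcl) is the third of them.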
Fri Jun 16 11:47:15 PDT 2000 Kevin Karplus

I looked at the predicted structures from the CAFASP site, and saw some
moderately strong consensus families:

    27   2.75.1
     9   3.1.7
     7   2.1.1
     6   2.63.1
     4   2.66.1
     3   2.9.1
     3   2.65.2
     3   2.64.2

Since we have an almost all beta secondary structure prediction, we should
probably look into alignments for 2.75.1 (pectate lyase) and 2.1.1 more
carefully.

Mon Jun 26 09:48:46 PDT 2000
Remade 2ry predictions

Mon Jun 26 09:52:01 PDT 2000
Remade 2ry predictions

Christian Tue Jun 27 16:09:41 PDT 2000

Prosite entry PS00800; PECTINESTERASE_1 gives some hints:

    Prokaryotic and eukaryotic pectinesterases share a few regions of
    sequence similarity [1,2,3]. We selected two of these regions as
    signature patterns. The first is based on a region in the N-terminal
    section of these enzymes; it contains a conserved tyrosine which may
    play a role in the catalytic mechanism [3]. The second pattern
    corresponds to the best conserved region, an octapeptide located in
    the central part of these enzymes.

Conserved tyrosine:
    STPFVILIK NGVYNERLTI
                 ^
Second Prosite pattern:
    I SGTVDFIFG

Investigating 1bn8A:
T0100-1bn8A-global.pw.a2m.gz does not appear to place the conserved patterns
anywhere interesting.
T0100-1bn8A-local.pw.a2m.gz doesn't align either.
T0100-1bn8A-vit.pw.a2m.gz doesn't align either.

Investigating 1cov3:
T0100-1cov3-global.pw.a2m.gz treats the stretch surrounding the first
pattern (conserved Tyr) as an insert and doesn't place the second in a
convincing place.
None of the other alignments align these patterns to structure.

Investigating 1aqm:
T0100-1aqm-global.pw.a2m.gz is the most interesting so far. The conserved
Tyr would be an insert, but it would be an insert just next to where the
second pattern matches the structure. Furthermore, there is an interesting
bit of conservation around the second pattern.
T0100-1aqm-local.pw.a2m.gz shows the second pattern aligning to structure
near the active site of 1aqm. There is some degree of residue identity.
T0100-1aqm-vit.pw.a2m.gz is not a good alignment. Only a short bit of T0100
is aligned to the structure.

So far, T0100-1aqm-global.pw.a2m.gz is the best lead.

Christian Wed Jun 28 13:12:44 PDT 2000

Investigating 1ex1A:
T0100-1ex1A-global.pw.a2m.gz, T0100-1ex1A-local.pw.a2m.gz and
T0100-1ex1A-vit.pw.a2m.gz are of some interest. They all align the first
part of T0100 to one domain-ish region of 1ex1A and there are some
interesting conservation patterns.

28 June 2000 Kevin Karplus

Currently I favor the highest-scoring
    1czfA/1czfA-T0100-fssp-global.pw.dist:T0100 342 -22.63 -18.32 3.3e-08
    ENDO-POLYGALACTURONASE II FROM ASPERGILLUS NIGER
But there is good reason to like the
    1qcxA/1qcxA-T0100-fssp-global.pw.dist:T0100 342 -75.63 -13.48 4.2e-06
alignment, which is to a pectin lyase, and puts the binding site on an arm
that swings over the binding pocket.

The simple scores don't seem to be very helpful, with 1ex1A and 1qa1A
looking rather poor.

29 June 2000 Christian

1aym3 is an extended structure with an unconvincing alignment. It is mainly
matching beta strands.

1bhe is another beta helix. There are some nice conservation patterns for
part of the alignment, but there are also some serious strand deletions.
Not our best beta helix match. If you do look at this structure, the global
alignments are what you want to look at.

1czfA is a beta helix. There are some decent conserved residues in part of
the 1czfA-T0100-fssp-global alignment, but once again some problematic
deletions and insertions. The known active-site Tyr is not positioned near
the groove of the helix. This, though, is one of our to-this-point
favorites.

1dabA is a more extended beta helix than any of the previous. Neither of
the Prosite patterns is placed near the supposed binding groove. There is
some level of conservation, but too many indels to make this structure a
plausible template.

1dbgA is also a beta helix.
There are some serious indels, but the main residues for the Prosite
patterns for the (T0100) family *are* conserved in 1dbgA. See the
1dbgA-T0100-global alignment.

1rmg is a beta helix that has some decent residue conservation. The active
site Tyr is placed just next to where 1rmg has bound ligand and the Tyr is
conserved in the structure. The other pattern may play a role in structural
stability, as it is aligned on a strand-turn-strand. I think it is too far
away to be active-site related. 1rmg is a possible contender.

1tyv is a longer type of beta helix. None of the alignments really agree.
The Prosite patterns match to inserts in the best alignment (1tyv-fssp).

So, in addition to Kevin's best-so-far mentioned above (1czfA and 1qcxA),
I think the other contenders are 1dbgA and 1rmg.

Of these four, I think:

1czfA - the alignments are a bit too broken, and I don't know if the Tyr is
poorly placed in the 1czfA-fssp alignment (which is probably the best
alignment). Not one of the top two.

1qcxA - The 1qcxA-fssp has plausible Prosite pattern placement, the indels
are moved to the loops, and the conservation is okay. The 1qcxA-global
alignment might be okay, except for the rather serious strand deletion in
the middle of the helix. The 1qcxA-fssp alignment is in the top two.

1dbgA - All of the alignments are too broken to make much of a case for
this one.

1rmg - The alignment here is compact and the residue conservation is
interesting. The Prosite patterns are sensibly placed. I think the best
alignment for this structure is 1rmg-T0100-global.pw.a2m.gz. The global
alignment is a close second.

I think we should predict using the 1qcxA-fssp and 1rmg-T0100-global/fssp
alignments. Which is better I cannot really say.

Mon Nov 27 13:20:16 PST 2000

The 2.75.1 superfamily is correct. I don't know yet about the alignment.

Fri Dec 22 15:54:33 PST 2000

SAM-T2k had a good, but not perfect alignment. First model (1qcxA) slightly
better.
SAM-T99 correct hit was number 8 (3rd superfamily).
Correct fold selected by functional match and by CAFASP hits.

Would 2track prediction have helped here?
Running 2track to see how it would have done.
Discounting self-hit 1qjvA, next best are

    1pcl    2.75.1.1.2
    1dmlA
    2pec    2.75.1.1.1
    1air    2.75.1.1.1
    2rmpA   2.47.1.2.4

So the right superfamily would have come to the top with 2track prediction,
even without functional match.
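The superfamily conclusion from the 2track hits above can be checked by
truncating each SCOP identifier to its first three fields and counting. This
is a rough sketch under two assumptions: that the first three dotted fields
identify the superfamily (which matches how these notes use 2.75.1), and
that 1dmlA is a hit with no SCOP identifier listed. The helper names
superfamily and tally_superfamilies are hypothetical.

```python
from collections import Counter

# 2track hits after discounting the self-hit 1qjvA.
# None marks a hit with no SCOP id listed; that reading of the
# list is an assumption, not something the notes state outright.
HITS = [
    ("1pcl", "2.75.1.1.2"),
    ("1dmlA", None),
    ("2pec", "2.75.1.1.1"),
    ("1air", "2.75.1.1.1"),
    ("2rmpA", "2.47.1.2.4"),
]


def superfamily(scop_id):
    """Keep the first three dotted fields, e.g. 2.75.1.1.2 -> 2.75.1."""
    return ".".join(scop_id.split(".")[:3])


def tally_superfamilies(hits):
    """Count hits per superfamily, skipping hits without a SCOP id."""
    return Counter(
        superfamily(scop) for _chain, scop in hits if scop is not None
    )


if __name__ == "__main__":
    for name, count in tally_superfamilies(HITS).most_common():
        print(name, count)
```

With these five hits, 2.75.1 accounts for three of the four classified
structures and 2.47.1 for one.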