Tue Jul 11 10:14:48 PDT 2000 Kevin Karplus

T0107 Family 9 carbohydrate binding module, T. maritima

wu-blastp:            weak hit to 2nmtA, very weak to 1jl[xy][AB]
double-blast:         no hits
t2k:                  29 sequences (none of them in PDB)
secondary structure:  almost all beta (one helix)

CAFASP T99 hits
    CHAIN   E-value   FSSP    SCOP
    1vpsB   1.9       1vpsA   2.9.1
    1vpsA   2.2       1vpsA   2.9.1
    1cfr    15        1cfr    3.47.1
    1aszB   16        1b8aA   2.38.4, 4.87.1
    1asyA   16        1b8aA   2.38.4, 4.87.1
    1eerB   18        1eerB   2.1.2
    1lylA   24        1lylA   2.38.4, 4.87.1
    1fnhA   36        1fnhA   2.1.2
    3prn    50        3prn    6.4.3
(Only 1cfr was a target-model hit; the rest were template-model hits.)

T2K target-model hits
    CHAIN     E-value   FSSP      SCOP
    1qhpA     25        1cxlA ?   2.1.1, 2.3.1, 2.66.1, 3.1.7
    1qhoA     25        1cxlA ?   2.1.1, 2.3.1, 2.66.1, 3.1.7
    1xyz[AB]  28        1taxA     3.1.7
    1cfr      34        1cfr      3.47.1

Template hits (same as SAM-T99 CAFASP hits)
    CHAIN   Score   E-value   FSSP    SCOP
    1vpsB   -9.36    1.96     1vpsA   2.9.1
    1vpsA   -9.24    2.21     1vpsA   2.9.1
    1aszB   -7.26   15.98     1b8aA   2.38.4, 4.87.1
    1asyA   -7.25   16.1      1b8aA   2.38.4, 4.87.1
    1eerB   -7.14   18.0      1eerB   2.1.2
    1lylA   -6.83   24.6      1lylA   2.38.4, 4.87.1
    1fnhA   -6.42   37.0      1fnhA   2.1.2

No bi-directional hits.
The best alignments from these hits are
                                            simple  reverse_null  e-value
    1vpsA/1vpsA-T0107-global       T0107     -2.28    -16.91      1.4e-07
    1vpsA/1vpsA-T0107-local        T0107    -19.75     -9.28      2.8e-04
    1eerB/1eerB-T0107-global       T0107    -12.56     -8.60      5.5e-04
    1aszB/1aszB-T0107-local        T0107    -19.22     -7.08      2.5e-03
    1asyA/1asyA-T0107-local        T0107    -19.21     -7.07      2.5e-03
    1cfr/T0107-1cfr-local          1cfr     -16.14     -6.49      4.5e-03
    1lylA/1lylA-T0107-local        T0107    -17.91     -6.27      5.7e-03
    1b8aA/1b8aA-T0107-local        T0107    -18.18     -5.93      7.9e-03
    1vpsA/1vpsA-T0107-vit          T0107     -9.96     -5.85      8.6e-03
    1aszB/1aszB-T0107-vit          T0107    -11.72     -5.73      9.7e-03
    1asyA/1asyA-T0107-vit          T0107    -11.71     -5.72      9.8e-03
    1eerB/1eerB-T0107-vit          T0107    -15.75     -5.62      1.1e-02
    1b8aA/1b8aA-T0107-vit          T0107    -11.19     -5.39      1.4e-02
    1fnhA/1fnhA-T0107-global       T0107     -9.39     -5.10      1.8e-02
    1lylA/1lylA-T0107-vit          T0107    -10.71     -4.74      2.6e-02
    1fnhA/1fnhA-T0107-local        T0107    -19.86     -4.66      2.8e-02
    1eerB/1eerB-T0107-local        T0107    -22.14     -4.37      3.7e-02
    1taxA/1taxA-T0107-local        T0107    -15.10     -3.96      5.6e-02
    1taxA/1taxA-T0107-vit          T0107    -10.25     -3.39      9.8e-02
    1taxA/1taxA-T0107-fssp-global  T0107    -13.66     -3.36      1.0e-01
    1fnhA/1fnhA-T0107-vit          T0107    -10.63     -3.35      1.0e-01
    1lylA/T0107-1lylA-local        1lylA    -12.89     -3.27      1.1e-01
    1cfr/T0107-1cfr-vit            1cfr      -6.06     -3.19      1.2e-01
    1lylA/T0107-1lylA-vit          1lylA     -6.13     -3.12      1.3e-01
    1fnhA/1fnhA-T0107-fssp-global  T0107     -5.04     -1.94      3.8e-01

1vpsA/1vpsA-T0107-global has a large gap that would be hard to close.
1vpsA/1vpsA-T0107-local looks awful; most of the strands of the sandwich are missing.
1eerB/1eerB-T0107-global has two big gaps that would be hard to close.
1aszB/1aszB-T0107-local is a compact DNA-binding domain, but the secondary-structure match is poor.
1asyA/1asyA-T0107-local looks almost identical to 1aszB/1aszB-T0107-local.
1cfr/T0107-1cfr-local has some helix matches, but not enough to be a complete fold, and the secondary-structure match is poor.
1lylA/1lylA-T0107-local has few gaps and some nice conservation striping across the sheet, though the secondary-structure prediction is not great.
1b8aA/1b8aA-T0107-local does not have great conservation, but is otherwise ok.

Tue Jul 18 11:47:11 PDT 2000 Kevin Karplus

CAFASP hits:
    10  2.28.1
     8  2.44.1
     5  2.9.1
     4  3.16.4
     3  4.87.1
     3  4.13.2
     3  2.40.4
     3  2.1.1
Note: we do not get the most popular hits (2.28.1 and 2.44.1), but we do get the third (2.9.1).
Should we start looking at 2.28.1 and 2.44.1 possibilities?

Mon Aug 7 16:32:12 PDT 2000 Kevin Karplus

Top hits with 50-50 2track model:
    ID      Length  Simple   Reverse  E-value   SCOP
    2emo    237     -41.99   -10.20   1.6e-01   4.20.1
    1ema    236     -42.96   -10.20   1.6e-01   4.20.1
    1a0tP   413     -39.45    -9.75   4.2e-01   6.4.3
    1pvc3   238     -25.98    -9.21   4.2e-01   2.9.1
    1hcz    252     -27.52    -8.55   1.1e+00   2.2.6, 2.79.2
    3pcgM   238     -29.13    -7.95   3.1e+00   2.3.4
    2pcdM   238     -29.13    -7.95   3.1e+00   2.3.4
    1knb    196     -31.71    -7.67   3.1e+00   2.20.1
    1i1b    153     -32.79    -7.32   3.1e+00   2.40.1
    1xsoA   150     -32.23    -7.25   3.1e+00   2.1.8
    1nkr    201     -33.87    -7.25   3.1e+00   2.1.2
    1itbB   315     -31.09    -6.94   8.5e+00   2.1.1
    1jacA   133     -34.26    -6.93   8.5e+00   2.72.3
    3pchM   238     -28.16    -6.85   8.5e+00   2.3.4
    3seb    238     -26.26    -6.74   8.5e+00   2.38.2, 4.13.7
    1qhvA   195     -27.41    -6.73   8.5e+00   2.20.1
    2bbkH   355     -35.41    -6.69   8.5e+00   2.64.2
    1eq6A   189     -27.41    -6.69   8.5e+00   ?
    1a8i    842     -20.63    -6.53   8.5e+00   3.82.1
    1vdeA   454     -24.17    -6.51   8.5e+00   2.81.1, 4.79.2
(Note: buggy e-value computation)

Two-way hits (50-50 2track model and template model) are
    2.9.1   1pvc3 (FSSP rep 1aym3) and 1vpsA    Viral coat and capsid proteins
    2.1.2   1eerB, 1fnhA and 1nkr               Fibronectin type III

Tue Aug 8 10:10:39 PDT 2000 Kevin Karplus

The top homologs in the t2k alignment are all xylanases, and the xylanases in FSSP are represented by
    1taxA   3.1.7.3.18
    1axkA   2.28.1.2.3  2.28.1.11.2
    1exg    2.2.2.1.1
    1qldA   7.29.1.1.1
    1ct7A   7.29.1.1.1
    1xbd    2.2.2.1.2
There seem to be several different structures that have xylanase function.
If the 1qhpA and 1qhoA hits we got are to the 3.1.7 domain, then we have a strong functional hint.
It seems they are, since we get a 1xyzA hit, which is a close homolog of 1taxA.
The 3.1.7 superfamily are TIM barrels (Glycosyltransferases), but our all-beta secondary structure prediction is not compatible with a TIM barrel.

The best-scoring alignment to 1taxA or 1cxlA is
    1taxA/1taxA-T0107-local        T0107  188   -15.10   -3.96   5.6e-02
which matches only 33 residues:
    IGFNIQVNDANEKGQRVGIISWSDPTNNSWRDP
with 9 identical residues. The strand and turn look ok, but the helix match is dubious.

The popular 2.28.1 hit may be for 1axkA or a homolog. The hits actually found are
    hit     FSSP
    2ltnA   1led
    1qmoE   1nls
    1azdA   1nls
    1xnc    1axkA
    1xnb    1axkA
    2bvvA   1axkA
    1kit    1kit

The best 1axkA alignment is
    1axkA/1axkA-T0107-fssp-global  T0107  188    -5.82   -6.26   7.4e-03
The alignment is poor, because it grabs strands from both domains and omits many.
We should probably try a single-domain copy:
    1bvv, 1xnc, 1bcx, 1xnb, 2bvvA       (almost identical to half of 1axkA)
    1xnd, 1re[def][AB], 1enx[AB], ...   (about 50% identical)
    2nlrA                               only 11% identical   2.28.1.11.10
    1qu0A                               only 12% identical   2.28.1.4.1

The best 2nlrA alignment is
    2nlrA/2nlrA-T0107-local        T0107  188    -9.87    0.74   1.5e+00
which aligns nothing. Other local alignments don't look much better (one helix match for 2nlrA/T0107-2nlrA-2track-local).
The global alignment 2nlrA/2nlrA-T0107-fssp-global doesn't look TOO terrible, though it scores poorly.

Trying the FSSP models: the T0107-fssp-local and T0107-fssp-vit top hits have terrible scores.

13 August 2000 Christian

    1fnhA/1fnhA-T0107-fssp-global  T0107   -5.04   -1.94   3.8e-01
The alignment is too broken to be promising.
    1lylA/T0107-1lylA-vit          1lylA   -6.13   -3.12   1.3e-01
Aligns to a helix.
    1cfr/T0107-1cfr-vit            1cfr    -6.06   -3.19   1.2e-01
Short alignment to a helix.
    1eerB/1eerB-T0107-local        T0107  -22.14   -4.37   3.7e-02
Barely an alignment.

Since these were the dregs, I don't think we really have anything worth predicting with.
My weight is on a new fold. If not, then one of the not-so-hot 1vpsA alignments.
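For reference, the 1taxA/1taxA-T0107-local match from the 8 August entry (9 identical residues over a 33-residue aligned segment) works out to roughly 27% identity. A minimal Python sketch of that arithmetic follows; the function name is made up for illustration, and the counts are taken from the notes above rather than recomputed from the alignment file.

```python
def percent_identity(identical: int, aligned: int) -> float:
    """Percent identity over the aligned segment (not over the full target length)."""
    if aligned <= 0:
        raise ValueError("aligned length must be positive")
    return 100.0 * identical / aligned


# Target segment as recorded in the notes; used here only to check the length.
segment = "IGFNIQVNDANEKGQRVGIISWSDPTNNSWRDP"
aligned_length = len(segment)

# 9 identical residues is the count reported in the notes (an input, not derived here).
identity = percent_identity(9, aligned_length)
print(f"{aligned_length} aligned residues, {identity:.1f}% identical")
```

Relative to the full 188-residue target, the aligned segment covers only about 18% of the sequence, which is why the hit is treated as a weak functional hint rather than a fold assignment.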
14 August 2000

1ema/1ema-T0107-fssp-global has a moderate amount of conservation, well striped across the beta sheet. Unfortunately, it is missing two strands of the beta barrel and a few other bits of beta strands. The 2-track alignment covers even less of the barrel.
1a0tP/1a0tP-T0107-vit.pw: just two beta strands.
1lylA/T0107-1lylA-2track-local.pw (best-scoring 2track): pretty good 2ry match for 3 out of 5 or 6 strands of a barrel. Not complete enough to be a fold prediction.
1pvc3/T0107-1pvc3-2track-local is not compact.
1b8aA/1b8aA-T0107-fssp-global is not compact, and pulls out only fragments from a beta sandwich.

14 August 2000 Christian

I created 1ema-t0107-fssp-global-cbarrett.pw.a2m, which is a slightly hand-edited alignment.

Tue Aug 15 02:25:11 PDT 2000 Kevin Karplus

I want to run undertaker, but SCWRL is taking forever on the two most important alignments: 1ema-T0107-fssp-global and 1ema-t0107-fssp-global-cbarrett. (It has been running for over 10 hours on eclipse for just one of these!)
I probably need to come up with a quick-and-dirty amino-acid replacement method to use instead of SCWRL.

Tue Aug 15 09:59:14 PDT 2000

One SCWRL job has been running for 18.5 cpu hours on eclipse, so I'm going to run undertaker without the two most important alignments. If either finishes today, I'll do another undertaker run starting from that alignment.

Tue Aug 15 16:13:31 PDT 2000 Kevin Karplus

Added an option to SLICER to accept early output from SCWRL, and killed the infinite-loop SCWRL jobs.
None of the try1 undertaker conformations looked at all reasonable.
Edited Christian's alignment to unalign the strip through the center of the barrel, since it would result in uncloseable gaps and didn't match the 2ry structure prediction. Also moved one strand slightly.
The alignment in 1ema/1ema-T0107-karplus.a2m is being used as the seed to create undertaker alignments.
One possibility is that this is just a barrel with one more strand than 1ema, and that the stuff through the middle of 1ema is just one more barrel strand in the target. I don't know how to construct such an object with the tools available, though.

Sat Aug 26 00:28:04 PDT 2000

Remade 2track predictions.

Sat Aug 26 14:27:18 PDT 2000 Kevin Karplus

Remade 2track predictions. Now the top (weak) hits are
    % Sequence ID   Length   Simple   Reverse   E-value   X count
    2emo            237      -42.31   -10.27    1.6e-01
    1ema            236      -43.20   -10.20    1.6e-01
    1a0tP           413      -39.49   -10.20    1.6e-01
    1pvc3           238      -26.96    -9.49    4.2e-01
    1hcz            252      -29.00    -9.00    4.2e-01
I need to look at the undertaker conformations.

27 Aug 2000 Kevin Karplus

The try1 undertaker conformations were terrible (they didn't include the 1ema alignments).
The try2 undertaker conformations are ok, but they did not succeed in closing the gaps.
We're probably best off using 1ema-T0107-karplus.a2m and not predicting whether the joining is done up the middle of the barrel or on the outside.
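
The two-way comparison from the 7 August entry (superfamilies hit by both the 50-50 2track model and the template models) amounts to a set intersection on SCOP codes. Below is a minimal Python sketch of that bookkeeping. The chain-to-SCOP tables are copied from the 11 July template-hit table and the 7 August 2track table; the helper name by_superfamily is hypothetical and is not part of SAM or any tool used in these notes.

```python
def by_superfamily(hits):
    """Invert {chain: [SCOP codes]} into {SCOP code: set of chains}."""
    table = {}
    for chain, codes in hits.items():
        for code in codes:
            table.setdefault(code, set()).add(chain)
    return table


# Template-model hits (11 July entry).
template_hits = {
    "1vpsB": ["2.9.1"], "1vpsA": ["2.9.1"],
    "1aszB": ["2.38.4", "4.87.1"], "1asyA": ["2.38.4", "4.87.1"],
    "1eerB": ["2.1.2"], "1lylA": ["2.38.4", "4.87.1"], "1fnhA": ["2.1.2"],
}

# 50-50 2track target-model hits (7 August entry); "?" entries are left empty.
two_track_hits = {
    "2emo": ["4.20.1"], "1ema": ["4.20.1"], "1a0tP": ["6.4.3"],
    "1pvc3": ["2.9.1"], "1hcz": ["2.2.6", "2.79.2"], "3pcgM": ["2.3.4"],
    "2pcdM": ["2.3.4"], "1knb": ["2.20.1"], "1i1b": ["2.40.1"],
    "1xsoA": ["2.1.8"], "1nkr": ["2.1.2"], "1itbB": ["2.1.1"],
    "1jacA": ["2.72.3"], "3pchM": ["2.3.4"], "3seb": ["2.38.2", "4.13.7"],
    "1qhvA": ["2.20.1"], "2bbkH": ["2.64.2"], "1eq6A": [],
    "1a8i": ["3.82.1"], "1vdeA": ["2.81.1", "4.79.2"],
}

template = by_superfamily(template_hits)
two_track = by_superfamily(two_track_hits)

# Superfamilies seen by both kinds of model.
shared = sorted(set(template) & set(two_track))
for code in shared:
    print(code, sorted(template[code]), sorted(two_track[code]))
```

Matching on the three-part superfamily code is a deliberate choice here: 3seb (2.38.2) and the 2.38.4 template hits share a fold but not a superfamily, so they are not counted as a two-way hit.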