Kevin's initial summary:
------------------------

I've looked briefly at the VAST alignment results for our predictions
in CASP3.  I categorized our results as

    correct    13
    correct?    2   I think so, but don't have direct VAST verification
    so-so       3   similar fold, but not great hit
    wrong?      2   I think we goofed up, but am not absolutely certain
    wrong      13   we definitely made a wrong prediction

correct: 13
    target  predicted
    t44     1eps
    t47     1mup
    t48     1dcpA
    t49     3pte
    t55     1esl
    t57     1gd1O
    t58     1akz
    t60     1gifA
    t64     1adr+1ois   (only 1adr verified)
    t68     1rmg
    t70     2omf
    t82     1bol
    t83     1lmb3

correct?: 2
    t52     new         VAST reports some weak hits
    t76     1almC       VAST reports other hits, but 1almC looks better
                        to me (theoretical model, so not used by VAST)

so-so: 3
    t46     2mcm        both beta-sandwich
    t74     2scpA       right family (calmodulin-like), wrong subfamily
    t81     3chy        Hmm---very similar 5-strand sheets with 5 alpha
                        helices, but threaded in different order
                        (cyclic permutation)

wrong?: 2
    t79     1new+1san         I thought we had a good chance here!
    t85     2cthA+1ycc+1avc   I thought we had a good chance here!

wrong: 13
    t43     new
    t53     1fvkA       I thought we had a good chance here!
    t56     2mhr
    t59     2dri
    t61     1amj
    t63     1pex
    t67     1rhi2
    t71     1hviB
    t75     1oya
    t77     1tif+1mit
    t80     1t7pB+1mugA

The most impressive of our correct ones are probably t83, t64, and t44,
though we seem to have gotten the 1ois part of t64 mis-rotated in the
t66 submission (t66 results not returned yet).

I'm reasonably satisfied with our results on the so-so ones---they were
either quite hard, or we were trying to make distinctions finer than
the superfamily or even family level.

We got more wrong than I would like, but most of them were quite hard,
and possibly no one will get them.  I'm a little disturbed at getting
t79, t85, and t53 so wrong, because I thought that each of them had a
fair chance of being right.
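A quick way to cross-check the tallies above is a few lines of Python. This is only a sketch: the target IDs are copied from the per-category lists as they appear in this summary, and the names `stated`, `listed` and `mismatches` are mine, not part of any of our tools.

```python
# Counts as stated in the category overview at the top of the summary.
stated = {"correct": 13, "correct?": 2, "so-so": 3, "wrong?": 2, "wrong": 13}

# Target IDs as they appear in the per-category lists (copied by hand).
listed = {
    "correct": "t44 t47 t48 t49 t55 t57 t58 t60 t64 t68 t70 t82 t83".split(),
    "correct?": "t52 t76".split(),
    "so-so": "t46 t74 t81".split(),
    "wrong?": "t79 t85".split(),
    "wrong": "t43 t53 t56 t59 t61 t63 t67 t71 t75 t77 t80".split(),
}

# Categories whose stated count differs from the number of listed targets,
# stored as (stated, listed) pairs.
mismatches = {
    name: (stated[name], len(targets))
    for name, targets in listed.items()
    if stated[name] != len(targets)
}
total_listed = sum(len(targets) for targets in listed.values())

for name, targets in listed.items():
    print(f"{name:9s} stated {stated[name]:2d}   listed {len(targets):2d}")
print("total listed:", total_listed)
print("mismatches:", mismatches)
```

Run against the lists as transcribed here, the only mismatch is the "wrong" category, which states 13 but lists 11 targets, for 31 listed targets in all. The sketch does not say which two targets are missing.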
-----------------------------------------------------------------
Christian's notes

33 targets

Target  What went right  What went wrong
t43                      X  (we had it, but said new fold)
t44     X
t46                      X
t47     H
t48     H
t49     H
t52     -                -  (new, but weak VAST)
t53                      X
t55     H
t56     -                -  (new fold)
t57     H
t58     H
t59                      X
t60     H
t61     -                -  (new fold)
t63                      X
t64     H
t65     -                -  (not verified)
t67     -                -  (new fold)
t68     H
t70     H
t71                      X
t74     X                X  actually partially correct
t75                      X  (Not on radar)
t76                         no VAST, theoretical
t77     -                -  (looks like new fold)
t79     -                -  (new fold)
t80                      X  (new, but 1 distant possibility)
t81     X                   (cyclic permutation)
t82     H
t83     X                   (just to domain 1)
t84     -                -  (synthetic construct)
t85     -                -  (new fold)

Threading targets, a correct structure (3)
----------------------------------
t44
t81 ?
t83

Threading targets, incorrect prediction (9 or 10)
----------------------------------
t43
t46
t52
t53
t59
t63
t71
t74 ?   I think this one is correct to some degree
t75
t80

No VAST information (3)
----------------
t65
t76
t84

Homology Targets, not for threading analysis (11)
--------------------------------------------------
t47 t48 t49 t55 t57 t58 t60 t64 t68 t70 t82

New fold, thus not for threading analysis (6)
-----------------------------------------------
t56 t61 t67 t77 t79 t85


t43
---
We predicted new fold, but were on the right track and could have made
a good prediction if we'd submitted our best bet---1ris.  1ris had a
sum-score of -7.08, 2nd behind 2end which had -7.14.  1ris has a VAST
p-val of .0056, rmsd 2.2, nres 79.  We noted that this ribosomal
protein was similar to 1pysB, which VAST gave a p-val of 10e-4.7.
It looks to me like we missed this one, and it should be discussed as
an example of what went wrong.

t44
---
1eps was about the best structure we could have used as a template,
ranked #2 in the neighbors list.

t45
---
Not released

t46
---
1psdB is in the neighbor list, and we had a sum score of -2.56.
Basically, anything we had for this target was too far in the noise.

t47
---
homology target
1mup is top

t48
---
homology target
1dcpA is essentially the same as the top hit.

t49
---
homology target
top VAST is 2bltB with pval of e-27.2
Our prediction was 3pte, with pval of e-23.5

t52
---
Anything that VAST considers similar to this target, which isn't much,
does not show up in t54-sum.rdb.

t53
---
We considered 1ak1, which is the top VAST neighbor.  We rejected it
because the 1fvkA template allowed us to cluster histidine residues in
a manner that would make metal-binding plausible.

t55
---
homology target
1esl has p-val of e-12.1, while top has e-14.0

t56
---
new fold

t57
---
homology target, 1gd1O top

t58
---
homology target, 1akz top

t59
---
Nothing that VAST considers structurally similar comes up in
t59-sum.rdb.

t60
---
1gifA, which was our prediction, was top in VAST's list

t63
---
FSSP says there is a similar structure, 1eif/2eifA.  VAST doesn't see
this.  VAST sees other structures.  Moot, since we don't come close.

Nothing for domain 1 in any file in t63 target directory.

domain 2:
    3ullA   pval .0070
        t63-sum98.rdb:t63 2ull -2.840
        Is 2ull the same as 3ull?
    1ltiH   pval .0221
        t63.remote_4-varh50-pdb.rdb:t63 1ltiH varh50 0 2 -1.500
    2sob    pval .0002
        t63.t98_2-varh50-pdb.rdb:t63 2sob varh50 0 2 -1.110

Aside from these and a few weaker scoring structures, we found nothing
else.  Essentially anything that we had was too far in the noise for us
to see.

t64
---
We get the easy part, the obvious 1adr hit.

t65
---
Nothing yet on this target from the CASP folks.

t68
---
Our prediction, 1rmg, is top in VAST

t70
---
Our prediction, 2omf, is top in VAST

t71
---
In the top 21 nonredundant (down to -4.0 nats) hits in t71-sum98.rdb,
we had four correct hits:

structure hit  t71-sum98 nonredund. rank  score  VAST pval
-------------  -------------------------  -----  ---------
1a6aA           9                         -5.04  probably same as others
1dlhA          10                         -5.02  0.0002
1aqdA          11                         -5.01  0.0001
1sebA          11                         -5.01  0.0002

These were all for domain 1.  We considered 1dlhA, but didn't find the
alignment convincing.

We also gave consideration to 1kit, especially in its alignment to the
first domain of T74.  We also noted a similarity to 1eut, to which FSSP
gives a Zscore of 36.1, %IDE 25.  1eut would have been one of the best
templates we could have chosen for t71.  It has a VAST pval of 10e-6.2.

VAST doesn't list anything in PDB as being a particularly strong match
to the second domain.  Some of the structures that VAST does list are
chains associated with some of the correct, but weak, hits we found for
the first domain:

    1a6aB
    1dlhE
    1aqdB/E/K

None of these, though, come up in our searches.

Summary: we missed a couple of correct templates for the first domain,
but didn't find anything that could've been considered correct for the
second.

t74
---
This is one we should have gotten, as many of our top t74-sum98 hits
(including our top hit) were correct templates.  Looking at the top ten:

structure  t74-sum98 rank & score  VAST pval
---------  ----------------------  ---------
1cm4A       1   -25.75             10e-4.1
1rro        3   -24.21             0.0259
1rec        6   -23.65             10e-5.4
1ncx        8   -22.03             0.0009
5pal       10   -20.45             0.05

There were a lot of high scoring hits for this target, and it just came
down to the issue of which alignment looked most plausible.  Thus we
chose 2scpA.

Looking at the structures, they are topologically identical.  The
difference lies in the orientation of the helices to one another.

Based on the visual comparison, I don't think it is correct to consider
this target as being completely missed.  I think it fits in both of the
categories "What went right" and "What went wrong".

t75
---
The only possible correct structure, 1jvr, didn't even come up in our
search.

t76
---
VAST doesn't use our template, since it is a theoretical model.
Compare structures by hand.

t77
---
new fold

t79
---
new fold

t80
---
VAST only lists one similar structure (1fmtA/B) to this target, which
only shows up as a positive-scoring hit in any of our searches.

t81
---
cyclic permutation, so VAST doesn't see it.  Comparison by hand shows
that they are very similar.

t82
---
homology target.  Ours is top VAST hit.

t83
---
domain 1 match, second domain is a new fold

t84
---
Our prediction and the actual structure are certainly similar, but we
cannot get a numerical similarity value since VAST doesn't make
comparisons to theoretical structures.


So here's my current summary:

We made about 40 predictions for CASP3.  Information on 33 of these has
been released.

    New fold:               6
    Homology targets:      11
    Threading targets
      "what went right":    4 or 5 (t74 included in both categories)
      "what went wrong":   10 or 9
    Not relevant(?):        2

Six targets were new folds.  We generally submitted some prediction for
every target, even if we knew it wasn't likely to be correct.  These
targets were:

    t56 t61 t67 t77 t79 t85

Another 11 of the targets were obvious homology targets.  For each of
these we predicted about the best template possible.  The homology
targets are:

    t47 t48 t49 t55 t57 t58 t60 t64 t68 t70 t82

Since our talk will be on what went right and what went wrong for our
fold predictions, I don't envision it discussing any of the above
targets.

The following targets should be considered for "what went right":

t44
t81     VAST doesn't see this, circular permutation
t83     Mostly right, I think
t74     Topologically the same, wrong subfamily
t76     VAST doesn't see this, since our template was a theoretical
        model.  Structures are similar.

For "what went wrong", we will probably want to consider these:

t43     Predicted new fold, but our 2nd best (weak) hit was correct.
t46     Correct template too far in the noise
t52     Nothing VAST likes was even remotely found by our methods
t53     We rejected the correct answer because we favored another
        alignment
t59     Nothing VAST likes was even remotely found by our methods
t63     Correct template too far in the noise
t71     Considered, but rejected, some correct templates for 1st domain
        of target
t74     Also included here since not completely correct
t75     The only possible template didn't show up in our searches
t80     Correct template too far in the noise

There are two miscellaneous target releases:

t65     Released structure looks incomplete, it's only two short helices
t84     synthetic construct pieced together---didn't use HMM methods
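
The bookkeeping in the summary above (6 + 11 + "4 or 5" + "10 or 9" + 2 against 33 released targets) can be checked mechanically. The following is a minimal Python sketch under the assumption that the target lists are exactly as written in these notes. The dictionary name and the category labels are only illustrative.

```python
from collections import Counter

# Target IDs copied from the summary lists in these notes.
categories = {
    "new fold": "t56 t61 t67 t77 t79 t85".split(),
    "homology": "t47 t48 t49 t55 t57 t58 t60 t64 t68 t70 t82".split(),
    "went right": "t44 t81 t83 t74 t76".split(),
    "went wrong": "t43 t46 t52 t53 t59 t63 t71 t74 t75 t80".split(),
    "miscellaneous": "t65 t84".split(),
}

# Count how many categories each target appears in.
counts = Counter(t for targets in categories.values() for t in targets)
distinct = sorted(counts)
repeated = sorted(t for t, n in counts.items() if n > 1)

for name, targets in categories.items():
    print(f"{name:14s} {len(targets):2d}")
print("distinct targets:", len(distinct))
print("in more than one category:", repeated)
```

With the lists as given, this reports 33 distinct targets, and t74 is the only target that appears in two categories. That is where the "4 or 5" and "10 or 9" in the summary come from.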