Wed May 24 08:57:58 PDT 2006 T0299
Make started Wed May 24 08:58:31 PDT 2006
Running on orcas.cse.ucsc.edu

Wed May 24 21:14:02 PDT 2006 Kevin Karplus
No good hits in pdb.blast.
All the iterated searches seem to end up with essentially the same
sequence logos.
No good fold-recognition hits (best E-value is 2 for 1atg).
No agreement among top alignments when superposed.

The try1-opt2 run does not actually look too bad: there is a sheet
using some of the predicted strands and the burial properties are
decent. It's probably not right, but it is at least moderately
plausible.

T0299 seems to have based try1-opt2 on 1qwkA, though another run
might well pick a different template, since this didn't get chosen
until the 8th TryAllAlign call.

For try2, I'll reduce the sidechain weight a little, and increase the
break weight. I'll also add sheet constraints from try1-opt2.sheets
and scale the rr constraints down by a factor of 10.

Thu May 25 15:45:28 PDT 2006 Kevin Karplus
The try2-opt2 model is again based on 1qwkA, and seems to have
similar disagreements on secondary structure with the neural nets
that try1 does. Should we try forcing a hairpin for V127-F140 and
move D112-L121 to be the helix currently done by V127-F140? Can we
do this with sheet and helix constraints? Do we need to add new
hand-tweaked alignments?

Sat Jun 10 14:29:14 PDT 2006 George Shackelford
Well, looking at try2, I think I'll do my usual 'throw the dice', by
dropping the *.under constraints from the costfcn, boosting the
rr.constraints back to max, and doing a try. Might as well see if we
can get some of the features that Kevin was looking for.

try3 running on orcas

Sun Jun 11 12:33:18 PDT 2006 Kevin Karplus
With the unconstrained costfcn, try1-opt2 scores best, then
try3-opt2, then try2-opt2.

    Costfcn         order of top from each run
    try1            try2-opt2  try1-opt2  try3-opt2
    try2            try2-opt2  try1-opt2  try3-opt2
    try3            try2-opt2  try1-opt2  try3-opt2
    unconstrained   try1-opt2  try3-opt2  try2-opt2

The undertaker models are generally forming more sheets than the
other servers. SAM_T06_server_TS1 is scoring better than try1, try2,
or try3, partly because of forming more sheet.

The sheet constraints can be found in decoys/try*.sheets and
decoys/SAM_T06_server_TS1.sheets, but I don't believe that there is
much consistency to the sheets.

Sun Jun 11 17:27:35 PDT 2006 George Shackelford
I need to get back to a new try here, I'm spending too much time on
T0296. Since I see that the best scores are e10-1 and higher, I'm
going to try the same kind of rr.t2k.constraints that I used on
T0296, i.e. start with the standard 449a_45 predictor and use data
based on t2k rather than t04. I'll crank them down by a factor of .2.
I'm dropping the try1 sheets. I want to see if I can get something
based more on the ehl2 constraints. For now I'm going to drop the
T0299.t04.undertaker-align.under and T0299.t06.undertaker-align.under
as part of try4.under. I'm really kicking things here. For now I'm
leaving TryAllAlign in.

try4 on peep.

Mon Jun 12 10:43:48 PDT 2006 Kevin Karplus
try4-opt2 looks pretty junky. Even with the try4 costfcn, try2-opt2
scores better. I think that we *need* sheet constraints for this
one---we just have to figure out which sheet constraints we believe
in. The rr information may help us choose a topology, but George is
relying too heavily on it.

(George has also not been using auto-fill-mode when editing the
README file, so I have to keep doing fill-region to make his text
more readable. I use a wide screen in order to read the score-all
tables, but that makes text with very long lines very difficult to
read.)

Mon Jun 12 10:50:50 PDT 2006 George Shackelford
(Sorry, I haven't been using my hard word wrap mode as I should. I
got running pretty fast getting stuff ready.)

Kevin is right.
When you look at the ehl2 script, we are certainly missing the proper
sheet. I'm going to work with the current sheets and the ehl2 logo to
see how to build the right sheet. The helices should fall into place
when we get it right.

Mon Jun 12 11:01:31 PDT 2006 Kevin Karplus
Stop the presses! The Pcons6_TS4 model (the top scoring one that
isn't ours in the server models using unconstrained.costfcn) looks
really good for the sheet formation, though it is a bit loose and
burial is not ideal. It sure beats anything we have. Let's pick up
sheet and helix constraints from it and try making our own version.
The next best is Pmodeller6_TS2 with essentially the same model.

I'll set up try5 using these constraints. We may also want to make
sure that whatever fold they stole it from is represented in our set
of possible alignments.

I'm starting a VAST run
    Request ID: 1045470502798811451
with the Pcons model to find out where it is from.

Mon Jun 12 11:22:24 PDT 2006 Kevin Karplus
VAST lists the lowest P-value hits to the Pcons model as
    1gzjA 1edg 1cen 1vjzA 1rh9A 1h4pA 1dxeA
these are mostly c.1.8.3 TIM-barrel variants

In pcem/indexes I ran
    grep 'c[.]1[.]8[.]3' < scop-in-scop.ids | keep-ids t2k.ids
to get a list of these templates in the t2k library---there are a lot
of them:
    1xyzA 1r85A 1hizA 1n82A 1ur1A 1bg4 1edg 1ceo 1cec 1eqcA 1cz1A
    1h4pA 1vjzA 1eceA 1c0dA 1h1nA 7a3hA 1a3h 1h5vA 1egzA 1g0cA 1g01A
    1bqcA 1qnrA 1qnoA 1uuqA 1fobA 1hjsA 1odzA 1gw1A 1j9yA 1ghsA 1aq0A
    1ghr 1jz8A 1dp0A 1bglA 1bhgA 1v0lA 1od8A 1e0wA 1b31A 1clxA 1i1wA
    1k6aA 1gokA 1tux 1xyfA 1fh9A 1fhdA 2his 1nq6A 1ta3B 1ogsA 1nofA
    1qw9A 1uhvA

We had a number of them come up in the scoring:
    grep -h 'c[.]1[.]8[.]3' *scores.rdb | awk '{print $1}' | sort | uniq
yields 39 of them
    1a3h 1aq0A 1bg4 1bqcA 1ceo 1clxA 1e0wA 1eceA 1edg 1eqcA 1eqpA
    1fh9A 1fhdA 1fobA 1g0cA 1ghr 1ghsA 1gokA 1h1nA 1h4pA 1h5vA 1hjsA
    1i1wA 1j9yA 1k6aA 1n82A 1nofA 1nq6A 1od8A 1ogsA 1r85A 1ta3B 1tux
    1uhvA 1ur1A 1uuqA 1v0lA 2his 7a3hA

I'll make
extra_alignments for these as MANUAL_TOP_HITS, and make sure that
they get included in the try5 optimization run.

Making all these alignments is a bit slow with the current Makefile.

Mon Jun 12 16:58:41 PDT 2006 Kevin Karplus
It is worse than slow---I keep getting
    execvp: /projects/compbio/bin/i686/hmmscore: Argument list too long
error messages. This may be a result of trying to change some
$(foreach) calls into for ... do ... done calls, though I don't
really see how.

Mon Jun 12 19:10:36 PDT 2006 George Shackelford
TIM barrel? I don't think so. I simply don't believe that structures
based on sequences of length 250 or more are a good model for a 180
length sequence. In fact I think that such alignments need to be
avoided. What I can believe is a partial barrel around a helix or a
sheet between helices. We should be looking for examples of those,
not more TIM barrels.

Mon Jun 12 20:56:20 PDT 2006 Kevin Karplus
try5-opt2 was a miserable failure.
George, did you *look* at the Pcons6_TS4 model? It is a hell of a
lot better than anything we've come up with so far.

Mon Jun 12 22:26:53 PDT 2006 George Shackelford
Ah, yes. I did look at Pcons6_TS4 and Pcons6_TS2. That's when I got
concerned with TIM barrel wannabes.

Perhaps you might like to look at 1prxB and especially 1i6wA.
I'm going to include these and some others from
T0299.t2k.best-scores.rdb:

    1atg   232  2.8900e-03  1.4157e+00  1atg   c.94.1.1             35834
    1qapA  290  1.4100e-01  8.0976e+00  1qpoA  c.1.17.1,d.41.2.1    29559,38595
    1prxB  219  1.4300e-02  8.2115e+00  1prxA  c.47.1.10            33077
    1prxA  221  4.1600e-02  8.3273e+00  1prxA  c.47.1.10            33076
    1xzoA  173  2.6000e-01  8.3395e+00
    1i6wA  180  4.0800e-01  1.0862e+01  1i6wA  c.69.1.18            61859
    1vjrA  262  1.0500e-02  1.1143e+01         c.108.1.14           100832
    1z5oA  236  5.8300e-01  1.1190e+01
    1yqgA  264  7.2600e-02  1.2361e+01
    1z5nA  235  6.6200e-01  1.2903e+01
    1ispA  180  5.1600e-01  1.4471e+01         c.69.1.18            76778
    1k68A  141  6.8800e-01  1.7099e+01         c.23.1.1             90943
    1zr6A  480  4.1900e-01  1.7561e+01
    1t6nA  208  8.5900e-02  1.8189e+01         c.37.1.19            106576
    1oz9A  142  2.1200e-01  1.9601e+01         d.92.1.15            93808
    1tib   270  3.2400e-01  2.0642e+01  3tgl   c.69.1.17            34738
    1fjeB  176  1.3200e-01  2.1447e+01  1fj7A  d.58.7.1,d.58.7.1    39210,39211
    1jevA  518  3.0600e-01  2.1703e+01  1jevA  c.94.1.1             35720
    1ghsA  307  1.5200e+00  2.2064e+01  1aq0A  c.1.8.3              28840
    1yw3A  201  7.4500e-01  2.2162e+01

c.69.1.x is appealing along with c.47.1.x and d.58.7.1.

Time to drop TryAllAlign and focus on these.

Darn. Ran into the same problem Kevin did. Kevin needs to use
do...done rather than foreach in that part of Make.main. I started
to do the changes but finally gave up. I left my version of
Make.main under the T0299 directory.

Tue Jun 13 13:02:53 PDT 2006 Kevin Karplus
I made changes yesterday to Make.main to switch over from $(foreach)
to for..do..done, and they made the long-line problems *much* worse,
which was entirely unexpected. I suspect a bug in make, which may or
may not have been fixed in a newer implementation. I'm going to try
a different fix today, but right now I'm more worried about the loss
of oldmacd and /projects/compbio/tmp. Until the tech staff get
oldmacd up again, nothing is going to work right.
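
A note on the "Argument list too long" message in the entries above:
exec fails with this error (E2BIG) when the combined size of the
argument list and the environment exceeds the system limit, so large
exported make variables can trigger it even when the command line
itself is short. The following is only a rough Python sketch for
checking how close a process is to that limit. It is not part of the
T0299 Makefiles, the function name exec_payload_bytes is made up for
illustration, and the byte count ignores pointer-table overhead.

```python
import os


def exec_payload_bytes(argv, env):
    """Approximate bytes handed to exec: every argument string and every
    NAME=value environment string, each with one terminating NUL byte."""
    arg_bytes = sum(len(os.fsencode(a)) + 1 for a in argv)
    env_bytes = sum(len(os.fsencode(f"{k}={v}")) + 1 for k, v in env.items())
    return arg_bytes + env_bytes


if __name__ == "__main__":
    # SC_ARG_MAX is the POSIX limit on argv plus environment size.
    limit = os.sysconf("SC_ARG_MAX")
    # "hmmscore" stands in for whatever command make is about to run.
    used = exec_payload_bytes(["hmmscore"], dict(os.environ))
    print(f"argv + environment: about {used} of {limit} bytes")
```

Running this from inside a make recipe (rather than an interactive
shell) shows the environment that make actually passes down.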

Tue Jun 13 14:43:47 PDT 2006 Kevin Karplus
The first thing I tried, using "define" to make macros that
contained returns, simplified the code for Make.main, but did not
fix the problem.

The second thing I tried, turning off the "export" of variables to
the submakes, did fix things.

I am now making extra_alignments for a fairly long list of templates.

Tue Jun 13 17:22:44 PDT 2006 George Shackelford
I have commented out Kevin's long list and introduced my own set of
MANUAL_TOP_HITS. I'm running make extra_alignments now.

I've removed the TryAllAligns and I'm using the MANUAL_TOP_HITS along
with try1, try3, try4.

Try6 is running on peep.

Tue Jun 13 18:45:56 PDT 2006 Kevin Karplus
I have to do a soft submission tonight or tomorrow morning. George,
you can give me a list of your 2 or 3 favorite models. I'm going to
do one run to polish up the Pcons model, just so that there is
something reasonably compact to submit. (Running as try7 on orcas).

Tue Jun 13 19:19:50 PDT 2006 George Shackelford
For now I like try1 or try2. Try3 and try4 are at least different
though contrived.

Tue Jun 13 20:26:32 PDT 2006 George Shackelford
Try6 seems to be improving on try4. I'm going to do another try
without including the old tries. I want something new.

try8 running on peep.

Tue Jun 13 21:38:29 PDT 2006 Kevin Karplus
I'm going to do a preliminary submission tonight, in case our servers
fail again. The submission is
    ReadConformPDB T0299.try7-opt2.pdb
    ReadConformPDB T0299.try6-opt2.pdb
    ReadConformPDB T0299.try1-opt2.pdb
    ReadConformPDB T0299.try3-opt2.pdb
    ReadConformPDB T0299.try2-opt2.pdb
(the order with the unconstrained cost fcn)

We can fight about the ordering (or find a better model) before the
real submission.

Wed Jun 14 13:36:06 PDT 2006 Kevin Karplus
Just noticed that the template-lib html files are messed up, though
the rdb files seem ok.

Wed Jun 14 16:49:35 PDT 2006 George Shackelford
Try8 was as bad as ever. Forget it.

After consulting with Kevin, I built try9 which I ran on peep.
I used only 1i6wA's under file as a source of alignments. Still
ended up ignoring the suggested strands and helices from t2k.ehl2.

I've decided that I should not need any rr.constraints since the
basic structure comes from 1i6wA. I'm going to just take that out
and try another run. For some reason, there continues to be a long
helix from about 65-93. There is no reason for it; I don't see it in
1i6wA either. Well, I'll do this "quick" try10.

try10 running on peep.

Wed Jun 14 19:07:22 PDT 2006 George Shackelford
This is frustrating. try10 is novel and wrong. Undertaker is
refusing to follow the lead I am trying to give it, and appears to be
doing everything it can to avoid a structure like 1i6wA.

OK. I'll give it some other structures and I'm really cranking up
the constraints, which are now only the ehl2 constraints for strands
and helices. Even these are not being respected. I'm going to
de-emphasize the alpha predictions to 0.1, turn down those weights on
the dry/wet stuff to 1.0.

Let's do one more try...
Try11 running on peep.

Wed Jun 14 21:16:21 PDT 2006 George Shackelford
Try11 "blew up." I de-emphasized the dries too much. I've got to
push it back together. I've taken the two new includes *.unders back
out, leaving only the 1i6wA. I've readjusted the cost weights as
follows:

    SetCost wet6.5 1 near_backbone 5 way_back 1 dry5 1 dry6.5 10 dry8 10 dry12 5 \
        phobic_fit 3 \
        sidechain 3 \
        n_ca_c 5 bad_peptide 10 \
        bystroff 5 \
        soft_clashes 20 backbone_clashes 2 \
        break 50 \
        pred_alpha2k .1 \
        pred_alpha04 .1 \
        pred_alpha06 .1 \
        constraints 20 \
        hbond_geom 5 \
        hbond_geom_backbone 10 \
        hbond_geom_beta 50 \
        hbond_geom_beta_pair 100 \
        missing_atoms 1

I've also put in rr.0.1.constraints to see if these can pull some
sheets together. At least try11 didn't have those awful long
helices. So I'll run try12, but I think I've got to talk to Kevin
about building my own hard-wired alignment and scwrling it or
repacking using rosetta.

Uh, oh.
I accidentally replaced the try11.costfcn with the version I meant
for try12.

try12 running on peep...

Wed Jun 14 23:11:10 PDT 2006 George Shackelford
I had stopped try12 because I noticed that it was getting an
alignment from something that I had not included. I commented out:
    # ReadFragmentAlignment NOFILTER SCWRL all-align.a2m
    ...
and restarted try12.

YES! Finally! Try12 looks a lot better. Still needs work, but it is
getting there. That part I commented out was vital.

Now I'm going to remove the rr.0.1.constraints, and put the two other
includes back in (for 1ispA and 1yw3A).

try13 running on peep!

Thu Jun 15 11:15:07 PDT 2006 George Shackelford
Try13 is a disappointment after try12. It appeared to fixate on
1yw3A which I had included.

I have done a VAST search on 1i6wA (I built a PDB with only chain A)
and found the following:

    >gi|14488512|pdb|1I6W|B Chain B, The Crystal Structure Of Bacillus
        Subtilis Lipase: A Minimal AlphaBETA HYDROLASE ENZYME
    >gi|2194040|pdb|1OIL|A Chain A, Structure Of Lipase
    >gi|3402115|pdb|1JFR|A Chain A, Crystal Structure Of The
        Streptomyces Exfoliatus Lipase At 1.9a Resolution: A Model For
        A Family Of Platelet-Activating Factor Acetylhydrolases
    >gi|7766909|pdb|1EI9|A Chain A, Crystal Structure Of Palmitoyl
        Protein Thioesterase 1
    >gi|34810083|pdb|1PJA|A Chain A, The Crystal Structure Of Palmitoyl
        Protein Thioesterase-2 Reveals The Basis For Divergent
        Substrate Specificities Of The Two Lysosomal Thioesterases
        (Ppt1 And Ppt2)
    >gi|61680204|pdb|1WB4|A Chain A, S954a Mutant Of The Feruloyl
        Esterase Module From Clostridium Thermocellum Complexed With
        Sinapinate
    ...

While there were others, I'm going to start with adding these to the
MANUAL_TOP_HITS.
    1oilA 1jfrA 1ei9A 1pjaA 1wb4A
They will also be included while 1yw3A will be excluded. It may be
the bad influence that affected try13.

I like the sheets from try12. I'm going to include those as well.

In try12 I included rr.0.1.constraints.
I have re-factored to 0.3 so I'm including rr.0.3.constraints. I
wonder how they helped in putting try12 together. Sometime I should
do a try where that is the only difference.

Try14 running on peep (where else?).

Thu Jun 15 16:04:40 PDT 2006 George Shackelford
Try14 looks nice, I think we're getting there. There is still some
sheet work to do by getting the two strong sheets together but an
hbond already exists.

Doing an unconstrained scoring shows that the difference between try7
and try12 is mainly in some of the wet/dry scores, and the alpha
predictions. I beg to differ. The phobic fit is about the same and
try14 is better on breaks with try7 better on soft clashes and
n_ca_c. I can't see that wet/dry should be as high as they are along
with questionable alpha predictions.

I am interested in seeing what the impact of the rr.constraints is.
I'm going to do a run the same as try14 without them and see what
forms.

Try15 running on peep - again!

Thu Jun 15 20:15:08 PDT 2006 George Shackelford
So try15 actually came out about right. It was different from try14
but it still has about the right shape; however, it didn't score as
well except by its own criteria. Of course that means Undertaker is
doing a good job fulfilling the constraints.

I decided to see what we could get by dropping the sheets, putting in
the rr.t2k.0.2.constraints (built using 449a_45 with t2k data and
factored by 0.2). So one more experiment...

try16 running on peep

Fri Jun 16 12:54:08 PDT 2006 Kevin Karplus
For agreement with 2ry structure, I like try7, try8, try14, try15
except for the strand that is predicted to be helix around A149-K156.

None of the models have attempted to make hairpins, though those look
quite plausible in the dssp-ehl2 predictions.

Make started Fri Jun 16 14:56:36 PDT 2006
Running on cheep.cse.ucsc.edu

I'm rerunning the initial make in order to get the o_notor2 and
n_notor2 predictions, which can be useful for hairpin prediction.

Tue Jun 27 13:24:55 PDT 2006 George Shackelford
I was surprised to see that the o_notor and n_notor constraints don't
provide single hbond constraints reflecting the hairpins. I suspect
the ehl2 and str2 helix and sheet predictions are more accurate than
those of notor (though I don't know if this has been tested). The
notor logos look interesting but I don't know how to interpret them.

I'm rerunning my ehl2 match program to find possible

    id      score      per residue
    5S      10N        10N
    1qc5A   338.803    1.88224
    1fnsA   329.935    1.83297
    1oakA   329.278    1.82932
    1n3yA   322.158    1.78977
    1ao3A   321.917    1.78843
    1zavA   321.365    1.78536
    1uxoA   320.812    1.78229
    1atzA   318.223    1.7679
    1shuX   316.518    1.75843
    1mf7A   315.049    1.75027
    1nni1   314.005    1.74447
    1kamA   312.463    1.73591
    1t0iA   308.882    1.71601
    1ispA   308.719    1.7151
    1qhxA   305.67     1.69817
    1mjnA   304.165    1.68981
    1rttA   303.697    1.68721
    2arkA   302.753    1.68196
    1xqiA   298.775    1.65986
    1qf9A   297.298    1.65166

Despite the 'good' score/residue, I'll believe it when I see it.

I'm running the make extra_alignments and read_alignments and I am
finding that there are a number of new hits. That could help but I
am waiting to see what I can get out of try17.

I could also do a VAST on try15 and see what comes up. I may find a
better structure...

I took the top ten above and set up try17 to use those as alignments
for generating new decoys. Ditto for the second set of ten ids and
try18.

try17 running on orcas
try18 running on lopez

Tue Jun 27 20:06:01 PDT 2006 George Shackelford
Try17 turned out well though it has bad breaks. It scores almost as
well as try7, the TIM wannabe. I'm going to find out where try17
comes from.
    1oakA 1qc5A
One look at 1qc5A and it's clear. 1oakA was also in the mix. I
think I'll comment both of them out for try19 and see if anything
else comes up.

Try19 running on peep

Try18 actually does ok, but it looks weird. I'm sure it's
reasonable, but I put it at the bottom of my list.
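
The "per residue" column in the ehl2 match table above is consistent
with dividing each score by 180, which matches the "180 length
sequence" remark earlier in this README. The Python sketch below
checks that and reproduces the split into a first and second batch of
ten ids (used for try17 and try18). The length of 180 is an inference
from the numbers, and the helper names are made up for illustration;
this is not the ehl2 match program itself.

```python
TARGET_LENGTH = 180  # assumption: inferred from score / per-residue ratio

# (id, score) pairs copied from the ehl2 match table in this README
EHL2_MATCHES = [
    ("1qc5A", 338.803), ("1fnsA", 329.935), ("1oakA", 329.278),
    ("1n3yA", 322.158), ("1ao3A", 321.917), ("1zavA", 321.365),
    ("1uxoA", 320.812), ("1atzA", 318.223), ("1shuX", 316.518),
    ("1mf7A", 315.049), ("1nni1", 314.005), ("1kamA", 312.463),
    ("1t0iA", 308.882), ("1ispA", 308.719), ("1qhxA", 305.67),
    ("1mjnA", 304.165), ("1rttA", 303.697), ("2arkA", 302.753),
    ("1xqiA", 298.775), ("1qf9A", 297.298),
]


def per_residue(score, length=TARGET_LENGTH):
    """Total match score divided by the target length."""
    return score / length


def split_batches(matches, size=10):
    """Sort ids by descending score and cut them into batches of `size`."""
    ranked = sorted(matches, key=lambda m: m[1], reverse=True)
    ids = [template_id for template_id, _ in ranked]
    return [ids[i:i + size] for i in range(0, len(ids), size)]


if __name__ == "__main__":
    for template_id, score in EHL2_MATCHES[:3]:
        print(template_id, score, round(per_residue(score), 5))
    first_ten, second_ten = split_batches(EHL2_MATCHES)
    print("first batch: ", " ".join(first_ten))
    print("second batch:", " ".join(second_ten))
```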

Wed Jun 28 13:45:43 PDT 2006 George Shackelford
Review time - I now have several different models including try7, the
TIM wannabe. The others are based on specific - too specific -
models as follows (using more_unconstrained):

try19 - 1oakA based. Basically alpha/beta/alpha sandwich. Matches
        the ehl2 predictions well.

try7  - 1qwkA based. TIM barrel-like. May be a polished version of
        try1 or try2.

try6  - This is one of those cases where I mixed ReadConformPDB's
        with "include read-alignments-scwrl.under"'s. The result is
        simply a polishing of try4-opt2. The structure is alpha/beta
        with some alphas on the sides. I have no idea where try4 got
        it from.

try2  - 1qwkA based. TIM barrel-like.

try14 - 1i6wA based. This alpha/beta/alpha sandwich has two
        three-strand sheets that nearly form a six-strand sheet.
        Scores well with more_unconstrained in part because I dropped
        the weights on the dry costs which is similar to
        more_unconstrained. It does not appear that I've followed up
        with an effort to complete the sheet.

try17 - 1qc5A based. 1qc5A scored better than 1oakA in the ehl2
        match, but try17 doesn't do as well as try19. Maybe it can
        with some work. NOTE: this try has a strange break where the
        alignment fails to complete a strand!

try18 - 1t0iA(local) 1rttA(l) 1qgxA(l) 1shuX(l) 1mf7A(l)
        actually it looks like "other" alignments...

----------

Because I had made mistakes about using ReadConformPDB with
read-alignments-scwrl.under, and removing TryAllAlign when I
shouldn't have, everything below about try12 is suspect except for
try1, try2, and try7. That takes try6 out of the picture. I need to
complete the sheet for try14 even though try20 seems to be coming on
strong. try9 needs looking into.

WHOA try20 jumped to the top!

Wed Jun 28 22:32:06 PDT 2006 Kevin Karplus
Current order with unconstrained costfcn:
    try7, try20, try19, try6, try10, try8, try9, try18, try1

Current order for rosetta repacking:
    try11, try16, try20, try8, try7, try2, try17, try6, try10

Current order with try20 costfcn:
    try7, try17, try20, try8, try19, try6, try15, try14, try10

Current order with try1 costfcn:
    try20, try7, try2, try1, try14, try17, try8, try15, try18

Current order with more_unconstrained costfcn:
    try20, try19, try7, try6, try14, try2, try9, try18, try4

Current order with try7 costfcn:
    try7, try5, try20, try14, try15, try18, try2, try1, try19

Wed Jun 28 22:50:34 PDT 2006 Kevin Karplus
I don't particularly like try8 or try9.
I'm thinking of going with
    try7, try20, try19, try6, try14

Wed Jun 28 22:56:33 PDT 2006 Kevin Karplus
I have submitted those models. We still have time to do another
submission on Thursday if George has a disagreement with my choice or
has a new, stronger model to suggest. (George, send email if that is
the case, as I won't be monitoring the README).

From: George Shackelford
To: Kevin Karplus
Subject: T0299 submission
Date: Thu, 29 Jun 2006 01:53:24 -0700

It's a good selection of models. Because try19 does well in matching
our ehl2 prediction, I like:

    try19, try7, try20, try6, try14

I'm a bit worried about try6. It turned out to be a polished version
of try4. I still haven't figured out what the heritage of try4 is.
I'm afraid that it is based on one of the server models; otherwise it
looks good. If it's based on a server, I suggest replacing it with
try18.

try19, try20, and try14 have bad breaks that could be fixed if we
have enough time. Is it just a matter of ReadConformPDB plus
cranking up breaks to 200?

- George
------------------------------------------------------------

Thu Jun 29 08:00:48 PDT 2006 Kevin Karplus
try6 was optimized from try4.
try4 was optimized from alignment to 1cydA, according to try4.log.gz.

I think I like the current order better than putting try19 first, so
I'll leave it alone, unless a better model is created today.

Thu Jun 29 13:20:33 PDT 2006
Some polishing efforts for try19, try20, and try14

Polishing try19 as try21 on vashon
Polishing try20 as try22 on vashon

Thu Jun 29 14:48:57 PDT 2006 George Shackelford
T0299.try21-opt1 actually looks good - no breaks that I can see
(though there can still be some less visible ones). I'm going to do
the same with try14. Try14 still doesn't form a full 6-strand sheet
but I'm not going to press for that.

Polishing try14 as try23 on orcas

OK, I don't like try7 but in all fairness I should try and mend the
breaks there as well.

Polishing try7 as try24 on orcas

Thu Jun 29 17:47:53 PDT 2006 Kevin Karplus
With the try24 costfcn, the order is
    try22 < try20 < alignment to 1oakA
    try24 < try7
    try21 < try19
    try23 < try14
(George did not try to polish try6)

Rosetta least hates repacking
    try11, try22, try24, try16, try20, try8, try7, ...

I'll resubmit
    try24-opt2, try22-opt2, try23-opt2, try21-opt2, try6-opt2

Thu Jun 29 18:00:45 PDT 2006 Kevin Karplus
So submitted.
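
One simple way to compare the six orderings listed in the Wed Jun 28
22:32:06 entry is to count how often each model lands in the top three.
The Python sketch below does only that. It is not how the submitted
order was chosen (that was a judgment call recorded in the entries
above), and the tally function is made up for illustration.

```python
from collections import Counter

# Orderings copied from the Wed Jun 28 22:32:06 PDT 2006 entry
ORDERINGS = {
    "unconstrained": "try7 try20 try19 try6 try10 try8 try9 try18 try1",
    "rosetta repacking": "try11 try16 try20 try8 try7 try2 try17 try6 try10",
    "try20": "try7 try17 try20 try8 try19 try6 try15 try14 try10",
    "try1": "try20 try7 try2 try1 try14 try17 try8 try15 try18",
    "more_unconstrained": "try20 try19 try7 try6 try14 try2 try9 try18 try4",
    "try7": "try7 try5 try20 try14 try15 try18 try2 try1 try19",
}


def top_n_tally(orderings, n=3):
    """Count, per model, how many orderings place it among the first n."""
    tally = Counter()
    for ranked in orderings.values():
        tally.update(ranked.split()[:n])
    return tally


if __name__ == "__main__":
    for model, count in top_n_tally(ORDERINGS).most_common():
        print(f"{model}: top three in {count} of {len(ORDERINGS)} orderings")
```

A tally like this weights every cost function equally, which is itself
a debatable choice given that some of the cost functions were tuned on
the very models they rank first.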