Tue Jun 8 08:56:33 PDT 2004 T0198 DUE 2 Aug 2004

Tue Jun 8 13:46:52 PDT 2004 Kevin Karplus
try1 failed mysteriously. I'll put the T0198.t2k.dssp-ehl2.constraints into try1.constraints and run again. This protein seems to be an all-alpha protein, but there doesn't seem to be consensus about the fold.

Tue Jun 8 16:28:54 PDT 2004 Kevin Karplus
In fact, I suspect that this is a simple up-down bundle, though try1 did not end up bundling the helices. Finding a bundling pattern and getting the helices roughly aligned using loose constraints may be a good exercise for the grad students.

Wed Jun 9 08:02:43 PDT 2004 Kevin Karplus
I moved the cost function into try2.costfcn and added some loose constraints to encourage bundling. I also reduced the weight of the beta hbond functions for this all-alpha protein.

Wed Jun 9 10:43:16 PDT 2004 Kevin Karplus
In try2, the bundle is beginning to form, but the constraints are clearly wrong, resulting in distortion of the loops that join the helices. I forgot to comment out the output of the Template.atoms file in try2.under---that should be done for try3.under to keep things quick. This one should be straightforward enough for the students to try playing with it, even when I'm not at UCSC to guide the work closely.

Tue Jun 15 15:54:55 PDT 2004 Martina Koeva
In try3, I have left 2 of the old constraints and replaced the other 3 with new constraints. The new constraints are an attempt to pull the helix starting at residue K144 to the outside of the bundle and to pull the helix between residues 180 and 193 into the bundle.

# loose constraints for up-down bundle
# These should be made more specific once the bundles start forming.
# Some of the constraints have changed from try2 to attempt to bury
# one of the very hydrophobic helices inside the bundle.
constraint R3.CA L69.CA -10 10 20 1.0
constraint I84.CA L182.CA -10 10 20 1.0
constraint P115.CA M171.CA -10 10 20 1.0
constraint D118.CA K161.CA -10 10 20 1.0
constraint I119.CA A180.CA -10 10 20 1.0
constraint A121.CA L184.CA -10 10 20 1.0
constraint E128.CA R150.CA -10 10 20 1.0
constraint A133.CA A187.CA -10 10 20 1.0

I have also increased the number of superiterations to 5 and have commented out the output of the Template.atoms file in try3.under.

Wed Jun 16 16:11:09 PDT 2004 Martina Koeva
So the constraints that I chose did not work very well and there are quite a few distortions in the loops. I did not manage to pull the helix starting at residue 144 out as I really wanted. So for try4, I have left some of the constraints, but took out the constraints:

constraint I84.CA L182.CA -10 10 20 1.0
constraint P115.CA M171.CA -10 10 20 1.0

I tried putting in instead a set of constraints that will bring helix 3 (residues 77-103) together with helix 1 (approximately residues 6-35) and have them align at least within their predicted t2k and t04 hydrophobic regions. The same regions also seem to have a number of polar charged residues of opposite charge on the interacting helices, which makes me think I do want the two helices aligned and together. Here is the set of two constraints I added.

constraint V16.CA V80.CA -10 10 20 1.0
constraint A19.CA I84.CA -10 10 20 1.0

Fri Jun 25 18:30:27 PDT 2004 Kevin Karplus
News from CASP: T0198 was solved with 7 iron ions, 1 nickel ion and 1 calcium. Note that in the homologue structure (that was also solved by this group), there is no bound ion. So, it is unknown whether those ions are essential for protein folding.

Mon Jul 5 22:12:22 PDT 2004 Kevin Karplus
Just looked at try4---the helices have crumpled badly. Perhaps the helix constraints need to be a lot stronger or the bundling constraints weaker. I think that try2 probably comes the closest of any so far.
We may want to make some guesses about how the helices bundle by making a physical model---cutting some paper straws to the appropriate lengths, for example. I think that the "near" script provides a clearer view of the buried and exposed sides of the helices than the "burial" script does, though we may want to tweak the colors for greater clarity.

Fri Jul 9 15:15:01 PDT 2004 Martina Koeva
I went back to using try2 as a starting point. As an initial try on the bundle for try5, I just turned down all the general bundling constraints and increased all the dry weights, as Kevin suggested, to try to improve the bundling and the burial of the hydrophobic side of the helices.

Thu Jul 15 13:57:44 PDT 2004 Kevin Karplus
try5 stepped on some of try2, but Martina had saved the files. I've renamed the outputs appropriately and scored them (using the try5 cost function) in decoys/score-all.try5.rdb.

try5 is not looking bad, but the break in the helix around N92 is annoying, and I'm not that convinced of the break around G15 either. I'll strengthen the helix constraints on those helices. I'll move the CreatePredAlphaCost commands into try6.costfcn, out of the score-all.under and try6.under files. I'll also tighten the "bundling" constraints, based on what residues come close in try5, and are predicted to be buried with the "near" script. Starting try6 on abyss.

Thu Jul 15 16:51:16 PDT 2004 Kevin Karplus
try6 scores slightly better than try5, but looks worse. We probably need to strengthen the helix constraints and loosen the bundling constraints. We might also want to try including some of George's constraints, since he has some that have a fairly high probability. I tweaked the weights of the components of the cost function in try7.costfcn, so that try5 scored better than try6 or try3. I'll try an optimization run with the try7 cost function.
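
A minimal sketch of why reweighting can reorder the decoys, assuming the total cost is a weighted sum of component costs. This is plain Python, not undertaker syntax; the component names (helix, bundling, burial), the weights, and all the numbers are made up for illustration and do not come from any try*.costfcn or score-all output.

```python
from typing import Dict, List


def total_cost(components: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of cost components; a component with no weight counts as 0."""
    return sum(weights.get(name, 0.0) * value for name, value in components.items())


def rank_decoys(decoys: Dict[str, Dict[str, float]], weights: Dict[str, float]) -> List[str]:
    """Return decoy names sorted from lowest (best) to highest total cost."""
    return sorted(decoys, key=lambda name: total_cost(decoys[name], weights))


# Illustrative component costs only (hypothetical, not real scores).
decoys = {
    "try5": {"helix": 2.0, "bundling": 6.0, "burial": 3.0},
    "try6": {"helix": 5.0, "bundling": 2.0, "burial": 4.0},
}

# Hypothetical weight sets: the second one favors intact helices over bundling.
old_weights = {"helix": 1.0, "bundling": 2.0, "burial": 1.0}
new_weights = {"helix": 3.0, "bundling": 0.5, "burial": 1.0}

print(rank_decoys(decoys, old_weights))
print(rank_decoys(decoys, new_weights))
```

With these made-up numbers the first weight set ranks try6 ahead of try5 and the second reverses the order, which is the kind of reordering that hand-tuning the weights is meant to produce.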
Fri Jul 16 07:40:25 PDT 2004 Kevin Karplus
try7 breaks up the helices---I think we need to increase the weight of the helix constraints quite a bit. Burial is also awful. I think we may really want to build a physical model out of straws or something for this one, marking the conserved polars and George's contact predictions, to see if there is a way to bundle these that makes sense.

Fri Jul 16 16:46:23 PDT 2004 Martina Koeva
For try8 I increased the helix constraints about 3 times for each of the helices, keeping the original proportions. The next step would be to actually build the physical model and try to bundle the helices from there.

Mon Jul 19 10:43:32 PDT 2004 Martina Koeva
I am not sure how much the increase in the helix constraint weights improved the bundling. I think I still see breaks in the helices. For try9 I tried using the best Robetta model, since I like the way the first 4 helices in sequential order were bundled. I picked up general residue constraints from the Robetta model and used them. The result was somewhat positive. There are breaks in the helices still and some of the helices are oriented at odd angles to the rest of the bundle. I like the burial slightly better though.

Mon Jul 19 13:56:07 PDT 2004 Martina Koeva
For try10 I removed all of the general bundling constraints and introduced only the constraints from the RR predictions (taken from T0198.280.rr.constraints). I have taken only constraints with a probability of 0.6 or above. If these constraints seem to be working for the bundling, I will go through the rest of the list and see which of the remaining ones look believable and can be used. The constraints operator was changed to bonus_constraints and the weight on the hbond_geom_backbone was increased from 0.5 to 0.8.

Tue Jul 20 15:57:34 PDT 2004 Martina Koeva
I removed a couple of constraints that seemed to be causing the helices to break in the core of the protein and increased the break weight.
I will wait and see the results from try11 and submit the try10-opt2.pdb file to VAST.

Wed Jul 21 14:13:44 PDT 2004 Martina Koeva
Ok, so try11 does not have any major breaks after I increased the weight from 20 to 50. However, as a result the helices look all messy again and nowhere close to looking like a bundle. The second helix "breaks" again around residues 51-60. And what look to be helices 1 and 2 (in sequence order) are actually predicted to be at least partially the same helix. So for try12, I have reintroduced the general constraints (keeping the high break weight) from the best Robetta model, which bundled at least the first 4 helices in a more plausible way.

I am also about to start try13 on a different machine, where I have increased the weight on the helix constraints so that they are all at 20 now with the exception of the very last helix at the C-terminus. The t04 predictions from str2 and dssp-ehl2 seem to indicate a helix there, even though it's not a strong prediction. On the other hand, undertaker seems to want to fold the segment around residues D218 to K225 to a helix. BUT it's also the C-terminal end, where everything seems to look like it wants to fold into a helix. For now what I will do is introduce an additional helix constraint between residues E219 and L222 (like a turn of a normal alpha helix) and put lower weight on it (weight=10).

Finally, I am submitting the structure from try11-opt2.pdb to VAST. It's not great at all, but it's worth a try.

Wed Jul 21 15:45:46 PDT 2004 Martina Koeva
I submitted the try11-opt2.pdb structure to VAST:
ID: VS59989
Password: T0198try11
and VAST found 15 structural neighbors.
The results seem quite interesting:
Top hit: 2LBD (or Ligand-Binding Domain Of The Human Retinoic Acid Receptor Bound To All-Trans Retinoic Acid)
Second hit: 1K4W-A (or Orphan Nuclear Receptor Ror Ligand-Binding Domain In The Active Conformation)

Both of these are the ligand-binding domains of nuclear receptors - all helical proteins that do not seem to show bundling. The third structural hit is also to a structure of the ligand-binding domain of a receptor (1A28-B). I will take a look at the original papers for these structures and see whether I can find anything useful in them. But I definitely would like to use the structural alignments and incorporate them in the next runs to come (try14 and after). Which ones would be worth including?

From martina@soe.ucsc.edu Wed Jul 21 15:43:59 2004
MIME-Version: 1.0
Date: Wed, 21 Jul 2004 15:43:57 -0700 (PDT)
From: Martina Koeva
To: Kevin Karplus
cc: Martina Koeva
Subject: T0198
In-Reply-To: <200407212018.i6LKI4Mf013693@cheep.cse.ucsc.edu>

Hi Kevin,

I have started 2 separate runs on T0198 and have submitted the last structure that I got from try11 to VAST. I am still not seeing quite what I want and I am not sure I am much closer to a bundle than before. I was wondering whether you could take a look at it later on tonight, which is when I believe try12 and try13 will be done. I decided to use some general constraints that I picked up between residues in the best Robetta model that we have for this target (model 5). In fact, I only looked at the first 4 helices, since I do not like the way Robetta folds the target into this extended conformation after the first 4 helices.

Thank you!
Sincerely,
Martina
------------------------------------------------------------

Thu Jul 22 07:06:48 PDT 2004 Kevin Karplus
I created an "unconstrained.costfcn" and used it to score all the decoys.
It has somewhat low weight for clashes and breaks, since we are not at the stage of final polishing, still trying to get some sort of reasonable bundle. The best-scoring models are try13-opt2 and try11-opt2.

In try13-opt2, the helix from E88-V104 is turned around so that the face predicted to be exposed is buried, and L183-I193, which is predicted to be buried, has been exposed. Packing is a little loose, but not terrible. None of the top rr constraints in the rr script have been met. Try11-opt seems to be a bit looser, and has a lot of predicted buried parts exposed and also does not meet any of the rr constraints.

I created an rr_bonus.costfcn that has helix constraints for 7 helices, and bonus constraints for all the rr predictions (for use with the new version of undertaker), and very weak clash and break constraints. It still likes try13 best, but try7 is now second-best. In try7, helix 88-107 is turned so that the wrong face is out, and 175-205 could use a quarter turn. Again the top rr constraints are not met.

If I turn up constraints and turn off the bonus option, I should be able to drive an initial random conformation fairly quickly into something with helices. I'll try that for try14.

Thu Jul 22 14:12:45 PDT 2004 Martina Koeva
Ok, so the paper for 2LBD indicates that the fold for the protein is an antiparallel alpha-helical sandwich. It consists of 11 helices, which is 4 more than it looks like we can have in T0198. It consists of 3 layers, where the middle seems to be orthogonal to the outer two: H1 and H3 on one side; H4, H5, H6, H8 and H9 in the middle; and H7, H10 and H11 on the other side. Looking at the helices in try13-opt2.pdb

Thu Jul 22 15:38:07 PDT 2004 Kevin Karplus
try14 has some good features, being a reasonable bundle with the rr constraints mostly met. Some of the helices have gotten a bit disordered, but I think that we might be able to tighten it up by increasing the helix constraint weights and making the rr constraints "bonus".
With the try15.costfcn, try7 scores best, but if I take out the "bonus" keywords, the best becomes ...
[Oops. I accidentally typed "make" without the "decoys/score-all.try15.rdb" argument, so I need to wait a bit for the main parts to be done.]

So, without the "bonus" keywords, try14 scores best, but try7 is very close behind. I'll do the optimization without the "bonus" keywords, to see if I can get try14 fixed up.

From martina@soe.ucsc.edu Thu Jul 22 15:12:40 2004
MIME-Version: 1.0
Date: Thu, 22 Jul 2004 15:12:39 -0700 (PDT)
From: Martina Koeva
To: Kevin Karplus
Subject: T0198
In-Reply-To: <200407222139.i6MLdf8V010050@cheep.cse.ucsc.edu>

I am putting this right now into the README file for T0198, but the paper for the top structural hit seems to indicate an alpha-helical sandwich fold, which would consist of 3 layers of helices. The outside helical layers are parallel to each other and the middle layer is orthogonal to both. However, all those ligand-binding domains of nuclear receptors seem to have 11 helices. I can kind of see such a fold trying to form in try13-opt2, even though as you suggest the burial for helix 88-107 and 175-205 does not look good.

Should I put in the alignments from VAST? If yes, which ones: only the top ones or all of the 15 alignments? I was thinking of including all of them. Given how hard it has been to fold this target into a bundle, I am inclined to believe it might be trying to fold into a helical sandwich. Either way, after I create the appropriate directories for those templates, do I need to put in anything else in them besides the edited alignment?
Thanks,
Martina

From karplus@soe.ucsc.edu Thu Jul 22 16:12:26 2004
Date: Thu, 22 Jul 2004 16:12:25 -0700
From: Kevin Karplus
To: martina@soe.ucsc.edu
CC: karplus@soe.ucsc.edu
In-reply-to: (message from Martina Koeva on Thu, 22 Jul 2004 15:12:39 -0700 (PDT))
Subject: Re: T0198

I think that we should simultaneously pursue 2 possibilities:
the alpha sandwich (ala try13)
the bundle (ala try14)
Why don't you work on the sandwich while I work on the bundle?
------------------------------------------------------------

Thu Jul 22 16:25:01 PDT 2004 Kevin Karplus
The best-scoring model with the unconstrained cost function is try13-opt2, so that one is definitely worth pursuing further.

Thu Jul 22 20:10:24 PDT 2004 Kevin Karplus
try15 polished up try14 a bit, but still had helix K175-M207 rather messed up. The conformation is better, but not quite good enough. It has moved well up in the unconstrained scoring, but is still a long way from the top.

For try16, I'll try to do further polishing by strengthening the helix constraints for the helix that most needs work and removing some of the lower-probability RR constraints. Hmm, if I go too far in that direction, I move try7 and try13 to the top, but I want to work mainly on try15. Instead of removing the weaker RR constraints, I'll change them to bonus constraints (say below 0.5). After tweaking the cost components and the constraint weights until try15 barely beat try7, I started a try16 optimization run.

Fri Jul 23 09:52:26 PDT 2004 Kevin Karplus
try16 improves on the score with the try16.costfcn, but the helix 174-209 is still messed up. The unconstrained cost function prefers try13, try11, try16, try12, try7, try10.

I just noticed that several of the "repack-nonPC" files were never created---Martina must not have been using the "make T0198.do13" command to create her decoys. I'll do
make -k try7.repack try11.repack try12.repack try13.repack
to repack some of the models that score well with the unconstrained costfcn.
The try11, try12, and try13 costfcns prefer try13, but the try7 one prefers try7. Rosetta likes try12-opt2.repack-nonPC best.

I want to create a chimeric model consisting mostly of try16, but with the final helix fixed. Positioning the helix by hand seems to be difficult (I'd have to learn a new tool like ProtoShop or Deepview), but my attempts to do superposition to get a decent possibility have not been very successful---I keep getting the wrong helices superimposing, because of the difference in how everything else packs. Undertaker does not provide adequate control of superposition.

Fri Jul 23 16:58:37 PDT 2004 Kevin Karplus
try17-opt2 is NOT what I wanted. The final helix did not pack neatly into the slot where I wanted it, but moved elsewhere, messing things up. Perhaps I need to get some help from Jenny to move the helix manually.

Fri Jul 23 17:44:27 PDT 2004 Kevin Karplus
Jenny is not around, so I'll try using a bunch of constraints, packing A180, L183, A187, I190, and I193 into the main body where I think they go. This will be try18.costfcn. Luckily it scores try17-opt2 best, so building from all existing models should do more or less the right thing.

If this works, it will be our first real indication that the rr constraints are useful, since the fold recognition was coming up with the models that Martina is working on.

Sat Jul 24 10:09:36 PDT 2004 Kevin Karplus
Foo! The helix did NOT get properly placed in try18.
Maybe I need to turn down the rr constraints and add some constraints to the neighboring helices:

Constraint A180.CB L166.CG 0 4 6 7
Constraint L183.CG V162.CB 0 4 6 7
Constraint I186.CG1 V80.CB 0 4 6 7
Constraint I186.CG1 L158.CG 0 4 6 7
Constraint I186.CG1 V104.CB 0 4 6 7
Constraint I190.CG1 G83.CA 0 4 6 7
Constraint I190.CG1 A101.CB 0 4 6 7
Constraint I190.CG1 V155.CB 0 4 6 7
Constraint I193.CG1 D152.CB 0 4 8 7
Constraint I193.CG1 C97.SG 0 4 6 7
Constraint I193.CG1 A87.CB 0 4 6 7

After this placement, I'll try rotating half the structure to get the helix P76-L107 to straighten out. This may require some other constraints, as well as removing some of the current ones.

Sat Jul 24 16:02:00 PDT 2004 Kevin Karplus
Well, try19 didn't do what I want, so maybe I'll ignore the problem and try to straighten the broken helix.

The order with the unconstrained costfcn is try13, try11, try18, try16, try12, try17, try19, try7, ...
The order with the try19 costfcn is try19, try18, try17, try16, try15, try14, try13, try7, ...
If I try to straighten the helix, the order becomes try7, try3, try19, ... but with judicious tweaking of the weights of the components, I can make try19 score best.

I'll try running try20 to see if it can fix up try19's problems. After that, I'll probably give up on this target, selecting the best of each style to submit, even if they seem to be wrong. Maybe I should try a try21 at the same time, using an unconstrained costfcn.

Sun Jul 25 08:58:31 PDT 2004 Kevin Karplus
try20 looks terrible. try21 is interesting. It is densely packed and burial is good, but there are two unexpected bends in helices: around A87 and Q125. It looks like a hinging motion would straighten that out. Maybe I should add just those helix constraints to try21.costfcn.

Sun Jul 25 16:50:04 PDT 2004 Kevin Karplus
try22 did not straighten the helices. It is also no longer a bundle, but an alpha sandwich. It scores the best with an unconstrained costfcn.
The order is try22, try21, try13, try20, try11, try18, try16, try12, try17, try19, ...
sandwich: 22, 21, 13, 11, 12
bundle: 20?, 18, 16?, 17, 19?

Maybe I should give up on the bundle hypothesis, or maybe we should submit this (or a better) alpha sandwich as our first model and the best bundle as our second model. Which is our best bundle? Probably try18 or try20, depending on whether you think of try20 as a bundle or not.

With a helix-only cost fcn (which includes helix constraints), the order is try20, try19, try3, try8, try17, try18, try1, ...

Rosetta still hates try12-opt2.repack-nonPC the least, so maybe we should include that? Perhaps we should submit try22, try20, try18, try12-opt2.repack-nonPC, try1? Martina will undoubtedly have some opinions on this target in tomorrow's meeting, where we'll make final decisions. (Try18 and try20 may be too similar---is there a bundle that DOESN'T break helix P76-M108?)

Mon Jul 26 12:03:23 PDT 2004 Kevin Karplus
Maybe try8 instead of try18?

From martina@soe.ucsc.edu Mon Jul 26 16:45:46 2004
MIME-Version: 1.0
Date: Mon, 26 Jul 2004 16:45:44 -0700 (PDT)
From: Martina Koeva
To: Kevin Karplus
cc: Martina Koeva
Subject: T0198 model selection
In-Reply-To: <200407260659.i6Q6xNGh017448@cheep.cse.ucsc.edu>

Hi Kevin,

I looked at try20 and try18 again and if we are going for diversity, I would say that it might be better to go with one of the earlier (try8) models, in which the helices are still straight. There are some minor variations/adjustments between try20 and try18, but I almost feel like they are not sufficient to justify submitting them as both model #2 and #3.

There seem to be quite a few problems with try8. Even though the helix P76-M108 is not broken, it is rotated the wrong way - the buried side of the helix is exposed and vice versa. The protein does not look compact either, a bit foamy.
But if the purpose of model #3 is to show a completely different packing/topology/folding, I feel like that one would fit better as our model #3.

So going back to your list of models #1-#5 for submission, I would vote for:

try22-opt2.pdb
try20-opt2.pdb
try8-opt2.pdb ## replacing try18-opt2.pdb
try12-opt2.repack-nonPC.pdb
try1-opt2.pdb

Sincerely,
Martina
------------------------------------------------------------

Sun Sep 19 08:51:22 PDT 2004 Kevin Karplus
I'll use T0198 as my initial test case for evaluating the predictions. T0198 is in the template library now as 1sumB, so evaluation should be fairly straightforward.

First, I'll add REAL_PDB:=1sumB to the Makefile, then create an undertaker evaluation script, similar to score-all, in casp6/starter-directory called evaluate.under. The main changes will be the addition of a real cost function (rmsd and rmsd_ca). I've also added FINAL_COSTFCN:=try23 to the Makefile, so that the evaluation can be compared with that costfcn.

In terms of all-atom rmsd, of the models submitted, the order is model1, model4, model5, model2, model3, all of which are better than the robetta models, but we had MUCH better models (like try5-opt2, try2-opt1, try3-opt2, try9-opt2, try16-try7-chimera, try10-opt1-scwrl, try16-opt2.repack-nonPC, try14-opt1, try15-opt2, try16-opt1, ...) that were not submitted. The final model was not much better than the initial fully automatic model, by this measure. We may need other measures, particularly ones that mimic the evaluations used by the assessors.

Wed Sep 22 05:01:41 PDT 2004 Kevin Karplus
Using an undertaker implementation of the GDT score, the best model is try5-opt1 at 21% (not real impressive).
The order of submitted models is robetta5 18.67%, model4 16.33%, model5 15.89%, robetta3 14.78%, model2 14.67%, robetta2 14.67%, model3 14.44%, robetta4 14%, robetta1 13.67%, model1 12.44%.

Neither the 21% GDT nor the 12.54 Ang RMSD is particularly impressive for try5-opt1, our best model, so this target seems to be a failure. Having our model 1 be worse than models 2, 3, 4, 5 is also not so great.

Concern: try5-opt1 and try5-opt1-scwrl, which should have identical backbones, are getting somewhat different GDT scores (21% and 19.5556%). This may be an unavoidable difference due to insufficient sampling of superpositions, or it may be a serious bug in the GDT computation. I may want to look at these two models in a GDT plot, to see what is going on.

Wed Sep 22 10:35:57 PDT 2004 Kevin Karplus
model4 was the repacked model that scored best with the rosetta energy function, and model5 was the full auto.

Fri Sep 24 12:26:27 PDT 2004 Kevin Karplus
The problem with GDT was quantization artifacts. With smooth GDT, the backbones that are essentially identical have differences that are only the size of the rounding errors.
name                    length  missing_atoms  cost    rmsd     rmsd_ca  GDT       smooth_GDT
robetta-model5.pdb.gz   235     0.0000         0.0000  28.3215  27.6598  -21.2222  -19.8664
T0198.try3-opt2.pdb.gz  235     0.0000         0.0000  17.5179  16.9872  -20.1111  -19.2232    our best
model4.ts-submitted     235     0.0000         0.0000  21.0487  20.4457  -15.5556  -15.7334
model5.ts-submitted     235     0.0000         0.0000  21.5315  20.7454  -15.8889  -15.3290    full auto
robetta-model3.pdb.gz   235     0.0000         0.0000  47.5007  47.8797  -14.8889  -14.4014
model2.ts-submitted     235     0.0000         0.0000  22.1168  21.4445  -14.6667  -14.2475
model3.ts-submitted     235     0.0000         0.0000  23.4197  22.8530  -14.4444  -14.1241
robetta-model2.pdb.gz   235     0.0000         0.0000  46.2395  46.3018  -15.0000  -13.9663
robetta-model4.pdb.gz   235     0.0000         0.0000  50.3555  50.4758  -14.2222  -13.3356
robetta-model1.pdb.gz   235     0.0000         0.0000  37.8855  37.6652  -14.3333  -13.2063
model1.ts-submitted     235     0.0000         0.0000  20.7280  20.1123  -12.4444  -12.2203
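
The quantization effect can be illustrated with a small Python sketch. It assumes the usual GDT_TS cutoffs of 1, 2, 4 and 8 Angstroms and a single fixed superposition (no search over superpositions). The smooth variant here is only a hypothetical stand-in that replaces each hard cutoff with a logistic ramp; the definition undertaker actually uses for smooth_GDT is not recorded in these notes, and the softness parameter and the distances are made up.

```python
import math
from typing import Sequence

THRESHOLDS = (1.0, 2.0, 4.0, 8.0)  # Angstroms; assumed GDT_TS cutoffs


def gdt_ts(distances: Sequence[float]) -> float:
    """Hard-threshold GDT (percent) from per-residue CA distances."""
    n = len(distances)
    # For each cutoff, the fraction of residues within that distance.
    fractions = [sum(d <= t for d in distances) / n for t in THRESHOLDS]
    return 100.0 * sum(fractions) / len(THRESHOLDS)


def smooth_gdt(distances: Sequence[float], softness: float = 0.25) -> float:
    """Hypothetical smooth variant: logistic ramp instead of a step at each cutoff."""
    n = len(distances)
    total = 0.0
    for t in THRESHOLDS:
        scale = softness * t  # ramp width grows with the cutoff (an assumption)
        total += sum(1.0 / (1.0 + math.exp((d - t) / scale)) for d in distances) / n
    return 100.0 * total / len(THRESHOLDS)


# Two made-up "nearly identical backbones": distances differ by only 0.02 A,
# but each one straddles a cutoff.
a = [0.99, 1.99, 3.99, 7.99]
b = [1.01, 2.01, 4.01, 8.01]

print(gdt_ts(a), gdt_ts(b))          # hard thresholds jump
print(smooth_gdt(a), smooth_gdt(b))  # smooth scores barely move
```

The example is deliberately extreme (every residue sits right at a cutoff), so the hard-threshold gap is far larger than the 21% versus 19.5556% seen for try5-opt1 and try5-opt1-scwrl, but the mechanism is the same: a tiny coordinate change flips whole residues across a cutoff.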