Tue Jun 27 10:37:27 PDT 2006 T0347 Make started
Tue Jun 27 10:38:31 PDT 2006 Running on lopez.cse.ucsc.edu

BLAST gets only weak hits in pdb:
    1epuA 28% over 79 residues, E-value 0.019

Tue Jun 27 15:11:45 PDT 2006 Kevin Karplus
The HMMs get a modest hit: 1vk1A (d.268.1.2) with E-value 0.0016.
This is not one that BLAST saw.

Tue Jun 27 18:37:36 PDT 2006 Kevin Karplus
try1-opt2 seems to have an N-terminal domain that matches the alignments well, but it falls apart after G95.

Wed Jul 5 08:04:00 PDT 2006 Kevin Karplus
Scored the server models with unconstrained.costfcn.
SAM_T06_server_TS1 scores best (better than try1-opt2).
Next is ROBETTA_TS4, ROBETTA_TS2, ...

Sat Jul 15 13:22:44 PDT 2006 Kevin Karplus
I'd better try a subdomain prediction, starting with G95.
G95-E205 make started on lopez.

Sat Jul 15 21:28:17 PDT 2006 Kevin Karplus
G95-E205/ alignments do not agree with each other, which is not too surprising, as the top hit (1xb4A) has E-value 2.5 and the alignment to it is tiny fragments.

G95-E205/try1-opt2 comes out as a 4-helix bundle, presumably from 2lbd. It is at least somewhat plausible, so perhaps we should make a chimera of it with try1-opt2 from the whole chain. It might need a bit of ProteinShop movement to put the domain so that it covers some of the hydrophobics of the first domain.

I don't know if I have the energy to do this tonight though (I don't have ProteinShop at home), and Zack is not available this weekend.

Sun Jul 16 10:57:48 PDT 2006 Kevin Karplus
I'm trying a new G95-E205 run, adding in the following constraints:
    # Strongly predicted Hbonds (from n_sep, o_sep, n_notor2, o_notor2)
    Hbond D113.N L117.O 1
    Hbond G116.N D113.O 1
    Hbond R119.N P111.O 1
    SheetConstraint P111 D113 R119 L117 2
    Hbond G148.N L143.O 1

Sun Jul 16 10:59:49 PDT 2006 Kevin Karplus
G95-E205/try2 started on cheep.

Sun Jul 16 11:25:32 PDT 2006 Kevin Karplus
G95-E205/try2-opt1 is getting the try2 constraints (both the hairpin and the helices), but the helices are not all packed against each other compactly. I may need to ProteinShop the domain to bury the hydrophobics.

G95-E205/try2-opt2 is similar: the 2 C-terminal helices look pretty good and the hairpin looks good, but the first three helices do not look like they have packed right.

Rosetta likes try2 better than try1.

I'll try making a chimera of try1-opt2 and G95-E205/try2-opt2.

Sun Jul 16 12:07:25 PDT 2006 Kevin Karplus
Rats, I'm going to have to reshape the C-terminal domain before I can superimpose without terrible clashes. This will almost certainly mean using ProteinShop.

I could just put the two domains into a chain with a big separation, and let undertaker pull them together. I haven't tried that for a while.

Sun Jul 16 12:29:29 PDT 2006 Kevin Karplus
Before trying to get a workable chimera, it may be worth trying to get the C-terminal domain better packed first. I'm doing G95-E205/try3 with stronger packing terms and a couple of extra constraints, to see if that improves packing.

Sun Jul 16 13:58:48 PDT 2006 Kevin Karplus
G95-E205/try3 does not look very useful. It is almost the same as G95-E205/try1 (though it does add the tiny hairpin), but it has terrible breaks. Even try2 looks better to the try3 costfcn.

I'll try polishing G95-E205/try2 with the try3 costfcn, and see if that accomplishes anything.

Sun Jul 16 14:02:19 PDT 2006 Kevin Karplus
G95-E205/try4 started on cheep.

Sun Jul 16 15:12:47 PDT 2006 Kevin Karplus
G95-E205/try4 scores better on the try3=try4 costfcn and unconstrained, but rosetta still prefers try2.

G95-E205 has
    try1 = approx try3
    try2 = approx try4

I think I may have to ProteinShop this protein to make any progress. I may have some time to do that in the morning.

Mon Jul 17 12:13:36 PDT 2006 Zack Sanborn
I've begun the process of stitching the models G95-E205/try4 and residues 1-95 from try1-opt2 of the full model together using ProteinShop. I've made a ProteinShop directory and put the two models that I want to put together in there.

I believe I have a good idea of how to put them together, since they both have what look like hydrophobic faces that can be placed next to one another. Now, it's just a matter of putting them close enough in ProteinShop for Undertaker to optimize it nicely.

Mon Jul 17 13:20:33 PDT 2006 Zack Sanborn
I've put together the two models: all of G95-E205/try4, and residues 1-94 from try1-opt2 of the full model, making the chimera in a file called 'decoys/T0347.chimera-renum.pdb.gz'.

The orientation may not be perfect, since currently there is a hydrophobic side and a hydrophilic side to the model. However, it's a nice enough structure that I'll try to optimize it with Undertaker.

I've started try2, optimizing the chimera on camano. While this is running, I will play with the orientation a little more and see if I can bury those hydrophobics a little better.

Mon Jul 17 16:07:02 PDT 2006 Zack Sanborn
At the targets meeting, Kevin made it clear that he wanted the subdomain G95-E205/try4 packed better. He wanted to make a bundle of helices with that tiny hairpin preserved. Unfortunately, the quickest way to do this was to use ProteinShop.

I've made two models that pack the helices in a better way. The models are similar to one another, as one was merely a backup on the way to the second model. The second model (T0347.t4o2.model2.pdb) has the best packing of hydrophobics.

I've started an optimization of the second model (G95-E205/try5). It is hoped that we can stitch this domain together with residues 1-94 of the try1-opt2 full model, as above.

While this is running, I'm going to try to make another model with a slightly altered topology. It will be good to have some options.
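
The chimera assembly described in the 13:20 entry (residues 1-94 of a whole-chain model followed by the G95-E205 subdomain model, renumbered) can be sketched roughly as below. This is only an illustration, not the script that was actually used: the function names are hypothetical, and it assumes the subdomain model is numbered from 1 (so G95 is residue 1 and the offset is 94) and that both inputs are plain fixed-column PDB ATOM records. It does not move any coordinates, so any gap or overlap between the two pieces is left for the optimizer or for ProteinShop.

```python
def format_atom(serial, name, resname, resseq, x=0.0, y=0.0, z=0.0):
    """Build a minimal fixed-column PDB ATOM record (chain A, demo use only)."""
    return (f"ATOM  {serial:5d} {name:<4s} {resname:3s} A{resseq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


def stitch_chimera(nterm_lines, subdomain_lines, last_nterm_res=94, offset=94):
    """Join residues 1..last_nterm_res of a whole-chain model to a subdomain model.

    Assumption: the subdomain model is numbered from 1, so adding `offset`
    restores whole-chain numbering (1 -> 95 when offset is 94).
    """
    kept = []
    # Keep only the N-terminal residues of the whole-chain model.
    for line in nterm_lines:
        if line.startswith("ATOM") and int(line[22:26]) <= last_nterm_res:
            kept.append(line.rstrip("\n"))
    # Append the subdomain, shifting residue numbers (PDB columns 23-26).
    for line in subdomain_lines:
        if line.startswith("ATOM"):
            line = line.rstrip("\n")
            resnum = int(line[22:26]) + offset
            kept.append(f"{line[:22]}{resnum:4d}{line[26:]}")
    # Renumber atom serials (PDB columns 7-11) so they run consecutively.
    out = [f"{line[:6]}{i:5d}{line[11:]}" for i, line in enumerate(kept, start=1)]
    out.extend(["TER", "END"])
    return out
```

Reading the two decoys (for example with gzip.open in text mode) and writing the result back out is left to the caller.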

Mon Jul 17 17:01:37 PDT 2006 Zack Sanborn
try2 has finished, this being the full model optimization of the chimera I made of G95-E205/try4 with try1-opt2. It is the best scoring model by a little bit, and it looks like Undertaker was able to resolve the chimera's soft clashes and breaks pretty well. try2-opt2 is also the best-scoring model with an unconstrained cost function.

Unfortunately, it still has a hydrophobic patch on one side that I would like to try to cover up better. So, I'm using ProteinShop to make another chimera using the same ingredients, but will try to orient them in such a way as to cover the hydrophobic patch better.

Mon Jul 17 17:40:51 PDT 2006 Zack Sanborn
try5 has finished. It isn't the top-scoring model, but it's in the top 5, which I think is fairly good for such a heavily ProteinShop'd model. I'm waiting for try6 to finish, since it has a more natural topology (up-down-up-down-up in a circle) than try5.

Undertaker took the first helix (G1-L14) of the ProteinShop model and twisted it away from the bundle. This exposes some hydrophobic residues. Depending on how try6 turns out, I may want to use some constraints to try to pin that helix down near the others.

Also, I've set up a try3 for the full model with a new ProteinShop'd chimera of G95-E205/try4 and 1-94 of try1-opt2. Unfortunately, I'm having a hard time finding a machine to start it on. I'll keep searching/waiting for one to open up.

Mon Jul 17 18:06:14 PDT 2006 Zack Sanborn
G95-E205/try6 has finished and it REALLY blew up. So, I'm going to have to redo both try5 and try6 and place constraints on the models to keep them in a bundle.

Kevin suggests using the "good" constraints found from the Undertaker-optimized version of the ProteinShop models. Then, use residue constraints (CB-to-CB) to contain the helix bundle. In Rasmol, using 'select distance', turning on backbone, selecting the CBs, and showing them as spacefill will help get the correct distances.
I will work on this later tonight.

Mon Jul 17 20:12:39 PDT 2006 Zack Sanborn
Well, what I meant by "later tonight" was "tomorrow"...

I HAVE, however, updated the superimpose-best.under and best-models.pdb.gz with the "best" models of T0347. Unfortunately, these models are mostly just terrible. The highest scoring model is try3-opt2, second best is try2-opt2, and third best is try1-opt2 (ignoring, as usual, the variations of each of these models: .repack, .gromacs0, etc.). So, at least the models were getting consistently better!

The models that I propose to submit for the soft deadline tomorrow are:
    T0347.try3-opt2.pdb
    T0347.try2-opt2.pdb
    T0347.try3-opt2.repack-nonPC.pdb.gz
    T0347.try1-opt2.pdb
    T0347.undertaker-align.pdb model 1

And, a quick look at best-models.pdb.gz shows that these models (with the exception of try3-opt2 and try3-opt2.repack-nonPC.pdb, which are very nearly the same) are all very different. Just what we want for an ab initio. Kevin may want to change things up a bit, though.

Obviously, much more work is necessary, but I believe our optimized models for the subdomain (95-205) will make some better total models... that is, once we make some good models for that subdomain.

Mon Jul 17 20:40:39 PDT 2006 Kevin Karplus
I substituted rosetta's favorite repacking, and submitted with the preliminary comment:

For the preliminary submission, we really only have fold-recognition models for the N-terminal domain, as the C-terminal domain needs new-fold techniques and we have not yet gotten a compact domain for it. We expect to use a combination of undertaker and ProteinShop to produce the C-terminal domain, and hope to have it done by the hard deadline.

For now, we have only undertaker-generated models:

Model 1 is T0347.try3-opt2, our best-scoring.
Model 2 is T0347.try2-opt2.
Model 3 is T0347.try2-opt2.gromacs0.repack-nonPC, which is model 2 reoptimized with gromacs then with sidechains repacked by rosetta.
It is rosetta's favorite of the models it has repacked.
Model 4 is T0347.try1-opt2, the fully automatic model.
Model 5 is sidechain replacement by SCWRL on an alignment to 1vk1A.

Wed Jul 19 11:27:22 PDT 2006 Zack Sanborn
Well, today, I'm going to work on optimizing the G95-E205 subdomain better. As a reminder, the domains were ProteinShop'd into two topologies of helix bundles (one "unnatural", model2.pdb, and one "natural" (up-down-up..), model3.pdb). Unfortunately, when Undertaker saw the models and their clashes, it blew the models up. model2 fared better, with only a single helix kicked out into space. However, model3 looks nothing like the desired model.

So, I'm going to re-optimize these models with Undertaker, but will place some strong residue constraints to hold down those helices.

For model2, I have the following CB-to-CB distances (found using Rasmol):
    PHE99.CB-ASP172.CB: 6.237
    ILE128.CB-LYS96.CB: 4.650
    ARG106.CB-PHE164.CB: 4.653
    ASN107.CB-TRP161.CB: 3.790
    ASP104.CB-SER121.CB: 1.966
    ASN107.CB-ALA146.CB: 8.850
    GLN120.CB-ALA146.CB: 7.245
    ASP133.CB-SER177.CB: 4.151
    LEU143.CB-LEU188.CB: 4.226
    ALA146.CB-ARG192.CB: 6.68
    TYR149.CB-ARG195.CB: 7.745
    PRO155.CB-HIS196.CB: 3.375
    PHE159.CB-GLU193.CB: 3.054
    ALA162.CB-LYS190.CB: 4.039

For model3, I have the following CB-to-CB distances:
    GLU98.CB-LEU131.CB: 4.848
    GLU98.CB-ALA181.CB: 3.489
    VAL102.CB-ILE124.CB: 4.985
    HIS105.CB-ASP123.CB: 4.884
    ARG106.CB-ARG192.CB: 2.902
    SER121.CB-ARG144.CB: 2.975
    PRO125.CB-ALA140.CB: 2.364
    PRO135.CB-ASP172.CB: 6.473
    GLU132.CB-PHE178.CB: 4.222
    PRO135.CB-ARG168.CB: 5.155
    LEU165.CB-MET186.CB: 3.482
    GLU158.CB-GLU193.CB: 2.389
    ALA150.CB-HIS196.CB: 3.554
    VAL152.CB-LEU197.CB: 4.565

Obviously some of these distances will have to be relaxed substantially, otherwise it's clash city!

Wed Jul 19 14:29:18 PDT 2006 Zack Sanborn
From these CB distances, I came up with the following residue constraints:

For model2 (renumbered for the subdomain):
    Constraint F5.CB D78.CB -10. 6.2 10.0 1
    Constraint I34.CB K2.CB -10. 5.0 10.0 1
    Constraint R12.CB F70.CB -10. 5.0 10.0 1
    Constraint N13.CB W67.CB -10. 5.0 10.0 1
    Constraint Q26.CB A52.CB -10. 8.0 10.0 1
    Constraint D39.CB S83.CB -10. 7.2 10.0 1
    Constraint L49.CB L94.CB -10. 5.0 10.0 1
    Constraint A52.CB R98.CB -10. 6.7 10.0 1
    Constraint Y55.CB R101.CB -10. 7.7 10.0 1
    Constraint P61.CB H102.CB -10. 5.0 10.0 1
    Constraint F65.CB E99.CB -10. 5.0 10.0 1
    Constraint A68.CB K96.CB -10. 5.0 10.0 1

For model3 (renumbered for the subdomain):
    Constraint E4.CB L37.CB -10. 5.0 10.0 1
    Constraint E4.CB A87.CB -10. 5.0 10.0 1
    Constraint V8.CB I30.CB -10. 5.0 10.0 1
    Constraint H11.CB D29.CB -10. 5.0 10.0 1
    Constraint R12.CB R98.CB -10. 5.0 10.0 1
    Constraint S27.CB R50.CB -10. 5.0 10.0 1
    Constraint P31.CB A46.CB -10. 5.0 10.0 1
    Constraint P41.CB D78.CB -10. 6.4 10.0 1
    Constraint E38.CB F84.CB -10. 5.0 10.0 1
    Constraint P41.CB R74.CB -10. 5.2 10.0 1
    Constraint L71.CB M92.CB -10. 5.0 10.0 1
    Constraint E64.CB E99.CB -10. 5.0 10.0 1
    Constraint A56.CB H102.CB -10. 5.0 10.0 1
    Constraint V58.CB L103.CB -10. 5.0 10.0 1

I chose the desired CB-to-CB distance in each residue constraint to be a minimum of 5.0, which should allow undertaker to expand the model enough while still keeping it fairly well-packed. Also, I set the maximum distance allowed to 10.0. This might have to be increased if the models score horribly.

I've also added the sheet and helix constraints from the Undertaker-optimized ProteinShop models. However, I had to use the sheet constraint from try5-opt2 in both cost functions, since try6-opt2 lost its sheet and so had no sheet constraint.

try7, optimizing T0347.t4o2.model2.pdb with the above residue constraints and the try5-opt2 sheet & helix constraints, was started on vashon.

try8, optimizing T0347.t4o2.model3.pdb with the above residue constraints and the try5-opt2 sheet constraints and try6-opt2 helix constraints, was started on camano.
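
The rule stated above (subtract 94 to renumber for the subdomain, use the measured distance as the desired value but never less than 5.0, cap at 10.0) can be sketched in a few lines of Python. This is an illustration of the stated rule only: the function name and regular expression are made up for the example, the leading "-10." and trailing "1" fields are copied verbatim from the hand-written constraints without interpreting them, and the hand-made lists above do not follow the rule in every row (a few pairs were dropped and some values differ).

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}

# Matches entries such as "PHE99.CB-ASP172.CB: 6.237".
PAIR = re.compile(r"([A-Z]{3})(\d+)\.CB-([A-Z]{3})(\d+)\.CB:\s*(\d+(?:\.\d+)?)")


def constraint_lines(text, offset=94, floor=5.0, max_dist=10.0):
    """Turn measured CB-to-CB distances into Constraint lines.

    offset   : subtracted from whole-chain numbers (G95 becomes residue 1)
    floor    : smallest desired distance allowed, to avoid forcing clashes
    max_dist : maximum distance written into every constraint
    """
    lines = []
    for res1, num1, res2, num2, dist in PAIR.findall(text):
        desired = max(floor, float(dist))
        atom1 = f"{THREE_TO_ONE[res1]}{int(num1) - offset}.CB"
        atom2 = f"{THREE_TO_ONE[res2]}{int(num2) - offset}.CB"
        # "-10." and the final "1" are copied from the log, not interpreted.
        lines.append(f"Constraint {atom1} {atom2} -10. {desired:.1f} {max_dist:.1f} 1")
    return lines
```

For example, feeding it "PHE99.CB-ASP172.CB: 6.237" and "ILE128.CB-LYS96.CB: 4.650" gives the first two model2 constraints listed above.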

Wed Jul 19 16:38:31 PDT 2006 Zack Sanborn
Both G95-E205/try7 and G95-E205/try8 have finished and the residue constraints worked! Neither scores very well, mostly due to the introduction of some severe breaks and soft clashes, but I think a little polishing will help bring these costs down. On the plus side, both models do a better job packing the hydrophobics.

The worst breaks are as follows:
    T0347.try7-opt2.pdb.gz breaks before (T0347)F159 with cost 2.42071
    T0347.try7-opt2.pdb.gz breaks before (T0347)E132 with cost 1.64626
    T0347.try7-opt2.pdb.gz breaks before (T0347)I109 with cost 1.13655

    T0347.try8-opt2.pdb.gz breaks before (T0347)V152 with cost 5.49699
    T0347.try8-opt2.pdb.gz breaks before (T0347)R119 with cost 2.33279
    T0347.try8-opt2.pdb.gz breaks before (T0347)P111 with cost 1.58665

So, it's clear that try8-opt2 needs a bit of work. The gromacs-optimized structure does better... So, I'll try optimizing it and see where we go with it.

try9, optimizing try7-opt2.pdb.gz with increased penalties for breaks and soft_clashes, was started on orcas.

try10, optimizing try8-opt2.gromacs0.pdb.gz with increased penalties for breaks and soft_clashes, was also started.

Fri Jul 21 12:49:20 PDT 2006 Zack Sanborn
try9 significantly improved soft_clashes and breaks over try7-opt2, which the run was based on. However, try9-opt2 scores only third best with the unconstrained cost function. It loses points for the costs 'n_ca_c' and 'bad_peptide'... I believe this means that this model has a strained backbone, but I'm not exactly sure. However, try9-opt2 scores very well in phobic_fit (as did try7-opt2).

try10-opt2 (which optimized try8-opt2.gromacs0.pdb) scores 12th, below even try1-opt2.pdb. It was optimizing a GROMACS model, so that may have something to do with it. It might be worth optimizing try10-opt2 to improve on where it fails (n_ca_c, bad_peptide and side_chain).
Actually, both of these models appear to be having the same problems (n_ca_c, bad_peptide, and side_chain), so I'm going to do another round of optimizations that increase these costs significantly.

try11, optimizing try9-opt2, was started on camano.
try12, optimizing try10-opt2, was started on orcas.

Wed Jul 26 12:41:00 PDT 2006 Zack Sanborn
Well, the try11 and try12 runs did improve the models slightly in the areas we wanted to (n_ca_c, bad_peptide, and side_chain). Still, these models cannot surpass try4-opt2 with the unconstrained cost function.

I also updated the grep-best-rosetta. Rosetta loves the following models:
    T0347.try2-opt2.gromacs0.repack-nonPC.pdb.gz -153.1
    T0347.try4-opt2.gromacs0.repack-nonPC.pdb.gz -142.3
    T0347.try1-opt2.gromacs0.repack-nonPC.pdb.gz -131.6
    T0347.try6-opt2.gromacs0.repack-nonPC.pdb.gz -109.2
    T0347.try9-opt2.gromacs0.repack-nonPC.pdb.gz -96.7
    T0347.try11-opt2.gromacs0.repack-nonPC.pdb.gz -83.4
    T0347.try12-opt2.gromacs0.repack-nonPC.pdb.gz -68.5

All other models have positive Rosetta scores. I'm happy to see try11 and try12 show up in the above list, but I'm sorry to see that try1-opt2 scores significantly better! Ah well.

I think I've done all I can do with the subdomain. The next step is to start putting this subdomain together with the rest of the protein and start optimizing the full protein. Unfortunately, I can't think of a way of doing this without ProteinShop.

Thu Jul 27 05:33:28 PDT 2006 Kevin Karplus
You can superimpose the subdomain on a whole-chain model, emphasizing CB atoms on the part of the chain where you want to do the crossover. If the models are compatible, cutting and pasting between the superimposed models can make a chimera.

You can also make a chimera without trying to line up the models, and hope that undertaker's gap closing will do it for you. This is sometimes the best way when a superposition would result in bad clashes.
You want your chimera to have few clashes, since damage to the subparts could result from trying to resolve the clashes before the parts are in their final orientations.

Fri Jul 28 10:53:54 PDT 2006 Zack Sanborn
Thanks Kevin. Actually, I thought of doing this second part already.

I've made a chimera (T0347.chimera3_renum.pdb) that combines residues 1-94 of our try3-opt2 model with G95-E205/try11-opt2 (our highest scoring model with a helix bundle). Thankfully, the two models when put into a chimera don't overlap (but there is a big gap). I'm hoping a quick optimization of the chimera, turning up breaks, will be enough to close this gap and make a good looking protein. If the orientation is not exactly perfect, we could then use ProteinShop to tweak that domain.

try4, optimizing T0347.chimera3_renum.pdb, was started on vashon.

I have also made a chimera of the first domain of try3-opt2 (residues 1-94) and G95-E205/try12-opt2 (our second highest scoring model with the helix bundle; this bundle is the more "normal" one). The chimera is found in decoys/T0347.chimera4_renum.pdb.gz. This one has an even LARGER break, but I'm still hoping that Undertaker manages to close the gap in a satisfactory way.

try5, optimizing T0347.chimera4_renum.pdb, was started on lopez.

Sun Jul 30 17:15:01 PDT 2006 Zack Sanborn
Both runs optimizing chimeras have finished. Undertaker was able to put together the "domains" in a satisfactory way. In fact, try4-opt2 and try5-opt2 are the top-scoring models with an unconstrained cost function.

However, looking at decoys/grep-best-rosetta, we see that Rosetta doesn't like them quite as much as try2-opt2 and try3-opt2. try4-opt2 and try5-opt2 score 3rd and 4th best, respectively, with small positive scores (14.0, 27.9).

Both models have a similar number of breaks (~30), with only about 10 being anything serious to worry about. The GROMACS-optimized versions have about half as many breaks, and most are smaller.
So, I'm going to start a couple of optimizations from the gromacs0 models for try4-opt2 and try5-opt2 to see if we can clear up the breaks.

try6, optimizing try4-opt2.gromacs0, started on lopez.
try7, optimizing try5-opt2.gromacs0, started on whidbey.

Mon Jul 31 12:59:29 PDT 2006 Zack Sanborn
try6 and try7 don't score too well with an unconstrained cost function with Undertaker, BUT Rosetta likes both of them significantly better than try4 and try5.

Undertaker doesn't like try6 and try7 due to n_ca_c and side_chain. It might be worth taking these models and upping the costs for both of these to see if we can get some models that both Undertaker and Rosetta like.

Mon Jul 31 16:19:19 PDT 2006 Zack Sanborn
try8 and try9 have finished. try8-opt2 is now the highest scoring model with an unconstrained cost function, which is good.

However, Kevin mentioned that turning up the costs that I turned up for try8 and try9 (specifically, sidechain and n_ca_c) is not the greatest thing to do. Since these tries still have some bad breaks, it'd be smarter to try to improve on breaks and soft_clashes before moving on.

try10, optimizing from all models, was started on vashon. The dry weights, soft_clashes and breaks were increased as per Kevin's suggestions.

Mon Jul 31 17:26:40 PDT 2006 Zack Sanborn
I'm going to predict that the following models will be submitted:
    T0347.try10-opt2.pdb (highest scoring model < try8-opt2)
    T0347.try8-opt2.pdb (2nd best < C-term domain + 1st helix bundle)
    T0347.try5-opt2.pdb (3rd best < C-term domain + 2nd helix bundle)
    T0347.try3-opt2.pdb (model where we got C-term domain)
    T0347.try2-opt2.gromacs0.repack-nonPC.pdb.gz (highest Rosetta scoring model)

Still waiting for try10 to finish before letting Kevin know. As of 17:43, it's just past the halfway point. Probably another hour before it's done.

I've updated the superimpose-best.under with these suggestions (can't run it quite yet, though!).

Mon Jul 31 18:21:00 PDT 2006 Zack Sanborn
Updated the T0347.method file with the above suggestions.

Mon Jul 31 19:22:39 PDT 2006 Kevin Karplus
try10-opt2 is the top-scoring model with the try10 costfcn, but rosetta likes try2, try3, try7, and try6 better.

The try9 costfcn orders them
    try8, try9, try10, try4, try2, try3
unconstrained.costfcn orders them
    try10, try8, try4, try5, try3, try9, try2

Why is try5 included and not try4 (perhaps a bigger difference from other models)?

Mon Jul 31 19:56:44 PDT 2006 Kevin Karplus
I've made the submission with comment:

We optimized the G95-E205 subdomain separately (the first domain seemed well modeled in the whole-chain predictions). We ended up using ProteinShop to build some helical bundles for the C-terminal domain, then optimized them with undertaker. We had to put some strong constraints on the bundles initially, to keep the bundle together until the clashes were resolved.

Model 1 is T0347.try10-opt2, our best-scoring model, made from optimizing try8-opt2 with undertaker.

Model 2 is T0347.try8-opt2, our second best scoring model, made from optimizing try6-opt2 < try4-opt2 < chimera of residues 1-94 from try3-opt2 and G95-E205/try11-opt2.
G95-E205/try11-opt2 < G95-E205/try9-opt2 < G95-E205/try7-opt2 < proteinshopped bundle.

Model 3 is T0347.try5-opt2, the best scoring of those models optimized from chimera4, which was built from 1-94 of try3-opt2 and G95-E205/try12-opt2.
G95-E205/try12-opt2 < G95-E205/try10-opt2 < G95-E205/try8-opt2.gromacs0 < G95-E205/try6-opt2 < different proteinshopped bundle.

Model 4 is T0347.try3-opt2, an optimization of chimera2, which used 1-94 of the fully automatic try1-opt2 and G95-E205/try4-opt2.
G95-E205/try4-opt2 < G95-E205/try2-opt2 < G95-E205 alignments.

Model 5 is T0347.try2-opt2.gromacs0.repack-nonPC.pdb, which rosetta liked best of all the backbones it repacked sidechains for.
try2-opt2 was optimized by undertaker from a chimera of 1-94 from the fully automatic try1-opt2 and G95-E205/try4-opt2. The undertaker output was then reoptimized by gromacs to remove small clashes, and the sidechains were repacked by rosetta.

(The comment that Zack had left in the T0347.method file was far too skimpy. This one is still too short, but I don't have time to dig through the README file and the log files any more, trying to reconstruct how the models were built.)
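
The "<" chains in the submission comment read as "was optimized from". A minimal sketch of recording that provenance as a lookup table, so a model's history can be printed rather than reconstructed from the logs, might look like the following. The dictionary and function names are hypothetical, and only the two subdomain chains quoted in the comment above are filled in.

```python
# Each key was made by optimizing its value (the "<" relation in the comment).
# Entries are copied from the submission comment; nothing else is inferred.
PARENT = {
    "G95-E205/try11-opt2": "G95-E205/try9-opt2",
    "G95-E205/try9-opt2": "G95-E205/try7-opt2",
    "G95-E205/try7-opt2": "proteinshopped bundle",
    "G95-E205/try12-opt2": "G95-E205/try10-opt2",
    "G95-E205/try10-opt2": "G95-E205/try8-opt2.gromacs0",
    "G95-E205/try8-opt2.gromacs0": "G95-E205/try6-opt2",
    "G95-E205/try6-opt2": "different proteinshopped bundle",
}


def lineage(model, parent=PARENT):
    """Follow 'was optimized from' links until a model has no recorded parent."""
    chain = [model]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain


def describe(model, parent=PARENT):
    """Format a lineage in the same '<' notation the submission comment uses."""
    return " < ".join(lineage(model, parent))
```

Chimeras have two parents, so a fuller version would need a list of parents per model; this sketch covers only the single-parent optimization steps.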