Tue Jun 27 10:37:32 PDT 2006 T0349 Make started
Tue Jun 27 10:39:15 PDT 2006 Running on shaw.cse.ucsc.edu

Tue Jun 27 10:48:36 PDT 2006 Kevin Karplus
BLAST gets only very weak hits in PDB:
    1j7hA 24% over 68 residues, E-value 0.613
The protein is only 75 residues long, so may be a feasible ab initio target.

Tue Jun 27 11:08:38 PDT 2006 Kevin Karplus
The HMMs may be getting a hit on 1zhvA.

Tue Jun 27 14:33:18 PDT 2006 Kevin Karplus
Top score with the HMMs: 1zhvA E-value 0.012

Tue Jun 27 18:32:02 PDT 2006 Kevin Karplus
Interesting: rosetta got a floating-point exception when trying to repack try1-opt2.

Tue Jun 27 20:52:07 PDT 2006 Kevin Karplus
try1-opt2 seems to agree well with the alignments and makes a fairly compact structure.

Wed Jul 5 08:08:39 PDT 2006 Kevin Karplus
I picked up the server models and scored them with unconstrained.costfcn.
SAM_T06_server_TS1 scores best, followed by Pcons6_TS5=Pmodeller6_TS5, SP3_TS5-scwrl, ROBETTA_TS3-scwrl, ...

Mon Jul 10 11:50:26 PDT 2006 Cynthia Hsu
I took a look at this target in RasMol, and according to the burial script, it has residues 11-18 and 28-43 (approximately) in exposed regions. Residues 11-18 are displayed as a helix, while residues 28-43 are represented by the coil between the first and second beta sheet. The HMM pdfs for the target indicate that residues 27-30 are extremely likely to be part of a parallel sheet (P or Q).
Because of this, I decided to change the SheetConstraints so that they encompassed residues 27-30, like so:
    SheetConstraint (T0349)L4 (T0349)T7 (T0349)R47 (T0349)I44 hbond (T0349)L5 1
    SheetConstraint (T0349)I25 (T0349)L30 (T0349)V50 (T0349)P45 hbond (T0349)G26 1
    //The above line changed from try1 (used to span from I25 to L28 and from V50 to R47)
    SheetConstraint (T0349)L28 (T0349)V29 (T0349)V48 (T0349)R47 hbond (T0349)V29 1
I also copied the rr constraints into the cost function, and added the following, since RasMol indicated that L41, currently protruding into the environment, should be closer to L13:
    Constraint L13.CB L41.CB -10. 7.0 14.0 0.6
Given the absence of cysteine residues, I also set "maybe_ssbond" to 0. The three His residues in this molecule also appear very far apart, but I didn't want to risk excluding the possibility entirely, so I set "maybe_metal" down to 0.2.
    [Sat Jul 15 16:30:54 PDT 2006 Kevin Karplus
    maybe_metal only affects cysteine residues. ]
As a future reference to whoever is eventually assigned to this protein, I've commented out "T0349.undertaker-align.pdb model 4", since it has a seemingly random exposed helix from residues 13-21 that, according to both the burial and near scripts, should not be in the environment.
try2 is currently running on whidbey.

Mon Jul 10 18:01:58 PDT 2006 Crissan Harris, Cynthia Hsu
In try2, a helix was formed from residues L30 to G39 that left a break and exposed a region that should not have been exposed, so we copied and pasted the helix constraints into the cost function and removed the appropriate helix constraints. We decided to see how this minor change would affect the overall structure before making any further changes.
try3 is currently running on lopez.

Tue Jul 11 12:17:26 PDT 2006 Crissan Harris, Cynthia Hsu
Straightening out the helix from residues 28-30 into a beta sheet may not have been a good idea, as it exposes regions that undertaker's burial script predicts to be buried.
However, examining try2, we considered that turning residues L28 to L41 into a helix may have been an appropriate method of burying these exposed regions. With this in mind, we decided to modify try2.costfcn for try4. Because the main problem with try2 was a very large break between residues 41 and 44, we decided to raise "break" to 150. We also decided to set "phobic_fit" to 0 and "sidechain" to 3 in an effort to salvage the backbone.
try4 started on whidbey Tue Jul 11 12:44:00 PDT 2006.

Tue Jul 11 14:46:10 PDT 2006 Crissan Harris, Cynthia Hsu
In try4, the break was sealed to about 0.3, but this caused the helix from residues L28 to L41 to form a loop that protruded into the environment. Given the minimal differences between try4.costfcn and try2.costfcn, we decided for try5 to repeat the procedure as before but restore "phobic_fit" and "sidechain" to their original values (2 and 5, respectively). We also lowered "break" to 120, so that there would be less pressure to close this break at the price of the structure. Because we liked the effect of the small helix from residues 31-39 on the exposure, we decided to add the helix constraints from try2-opt2.helices. We also raised the weight of "constraints" to 12.
try5 is currently running on lopez.

Tue Jul 11 16:17:02 PDT 2006 Crissan Harris, Cynthia Hsu
We were very pleased with the way try5 turned out. The helix from V11 to L19 actually turned slightly, so that the red regions indicated by the burial script were turned more towards the interior of the protein than they had been in previous models. After looking at the unconstrained scores to see where try5 performed worse than try2 did, we decided to raise "bad_peptide" to 15, "soft_clashes" to 40, "break" to 150, and "constraints" to 15 to preserve the current helical, sheet, and rr constraints. We then set try6.under to read in try5-opt2 as its input.
try6 is currently running on camano.
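The weight changes from try4 through try6 are easier to follow side by side. Here is a minimal Python sketch, not part of the undertaker tooling, that records only the weights these entries mention and reports what changed between consecutive runs. The names `TRY_WEIGHTS` and `diff_weights` are hypothetical, and weights that an entry does not mention are simply left out rather than guessed.

```python
# Weights as stated in the log entries for try4, try5 and try6.
# Only the weights the entries mention are listed; the real cost
# functions contain many more terms, so this is not a full costfcn.
TRY_WEIGHTS = {
    "try4": {"break": 150, "phobic_fit": 0, "sidechain": 3},
    "try5": {"break": 120, "phobic_fit": 2, "sidechain": 5, "constraints": 12},
    "try6": {"break": 150, "constraints": 15, "bad_peptide": 15, "soft_clashes": 40},
}


def diff_weights(old, new):
    """Return {name: (old_value, new_value)} for weights in `new` that differ.

    A weight that was not mentioned for the older run is reported with
    old_value None. Weights missing from `new` are ignored, because a
    missing entry only means the log did not mention it.
    """
    return {
        name: (old.get(name), value)
        for name, value in new.items()
        if old.get(name) != value
    }


if __name__ == "__main__":
    runs = list(TRY_WEIGHTS)
    for prev, curr in zip(runs, runs[1:]):
        changes = diff_weights(TRY_WEIGHTS[prev], TRY_WEIGHTS[curr])
        for name, (before, after) in sorted(changes.items()):
            print(f"{prev} -> {curr}: {name} {before} -> {after}")
```

Keeping such a table next to the notebook makes it quicker to see, for example, that "break" went 150, 120, 150 over these three runs.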
Wed Jul 12 18:44:07 PDT 2006 Crissan Harris, Cynthia Hsu
try6 performed second best, after try2. We copied try6.costfcn to try7.costfcn, then commented out the following:
    //include T0349.dssp-ehl2.constraints
    //include T0349.undertaker-align.sheets
    //include rr.constraints
Besides that, we decided to lower "bad_peptide" to 13, since our original increase to 15 may have been too high. try7.under takes try6-opt2 as its input.
try7 is currently running on shaw.
We also thought that it might be interesting to try to replicate try2.costfcn, but with a higher weight on "break" (raised to 150). try2.under was copied as try8.under, as we decided not to do this as a polishing run.
try8 is currently on orcas.

Thu Jul 13 10:42:54 PDT 2006 Crissan Harris, Cynthia Hsu
try7 scored the best, with the exception of try2. Its worst areas of performance were in "wet6.5", "near_backbone", "way_back", and "bad_peptide". Given that "bad_peptide" already has a relatively high weight of 15, we're hesitant to raise it even further.
try9 does not exist.
try8 did significantly worse than try2 did, with the only areas of improvement being in "soft_clashes" and "breaks". We were still curious as to what would happen if we raised the weight of "bad_peptide" and the dry weights, which were where this model scored the worst in comparison to try2. In view of this, "dry5" was set to 20, "dry6.5" was set to 30, and "bad_peptide" was set to 13.
try10 is currently running on orcas.

Thu Jul 13 11:48:44 PDT 2006 Crissan Harris, Cynthia Hsu
try10 still failed to perform as well as try2. In examining and comparing try2 and try5, our two favorite models, we observed that the sheets in try5 did not seem as rigid as in try2, and decided that this may have been why polishing runs of try5 still failed to score as well as try2 did, despite try2's very large break. In view of this, we copied try5.costfcn to try11.costfcn, and then changed the weight of the sheet constraints from 1 to 2.
We also lowered the weight of the helix constraint on L30 to G39 to 0.8, and raised the overall weight for "constraints" to 15.
try11 is currently running on shaw.

Thu Jul 13 14:14:35 PDT 2006 Crissan Harris, Cynthia Hsu
try11 actually produced a new fold, which was very interesting. However, as in all the previous models (with the exception of try2), the helix from N33 to I44 unraveled so that regions colored brown by the burial script are protruding into the environment. Examining the unconstrained scores, we noticed that try4 actually scored higher than try11, but when we compared the two models, we found that though they were very similar, we actually preferred try11, given that the loop from N33 to I44 did not protrude as much. Observing the loop in try11 in the burial script, we noticed that the alternating pattern of buried and exposed residues in this loop seemed similar to that of antiparallel beta sheets. We thought it would be interesting to add a constraint to force this loop to form two antiparallel beta strands, and thus, in try12.costfcn, added the following:
    SheetConstraint (T0349)M34 (T0349)L37 (T0349)I44 (T0349)L41 hbond (T0349)S35 1
try12 is currently running on shaw.

Thu Jul 13 15:18:55 PDT 2006 Crissan Harris, Cynthia Hsu
The sheet constraint on the loop did not have the desired effect, but did twist it slightly. We decided to increase its weight to 4. We also removed the helix constraint from the previous try:
    //HelixConstraint (T0349)L30 (T0349)G39 0.8
try13 is currently running on shaw.

Fri Jul 14 11:44:26 PDT 2006 Crissan Harris, Cynthia Hsu
We discovered that in try13.costfcn the added sheet constraint actually overlapped the sheet constraints that were originally present. In view of this, we changed the added sheet constraint as follows:
    SheetConstraint (T0349)Q32 (T0349)S35 (T0349)G42 (T0349)G39 hbond (T0349)S35 4
try14 is currently running on bacchus.
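An overlap like the one found in try13.costfcn can be caught before a run is started. The following is a rough Python sketch under two assumptions that are not confirmed by this log: that the first two residues of a SheetConstraint delimit one strand and the next two delimit its partner, and that the "originally present" constraints are the three listed in the Jul 10 entry. The helper names `strand_ranges` and `overlaps` are hypothetical.

```python
import re

# Matches residue tokens such as "(T0349)M34" and captures the residue number.
RESIDUE = re.compile(r"\(T0349\)[A-Z](\d+)")

# Assumed to be the sheet constraints already present (from the Jul 10 entry).
EXISTING = [
    "SheetConstraint (T0349)L4 (T0349)T7 (T0349)R47 (T0349)I44 hbond (T0349)L5 1",
    "SheetConstraint (T0349)I25 (T0349)L30 (T0349)V50 (T0349)P45 hbond (T0349)G26 1",
    "SheetConstraint (T0349)L28 (T0349)V29 (T0349)V48 (T0349)R47 hbond (T0349)V29 1",
]


def strand_ranges(line):
    """Return the two strand ranges of one SheetConstraint line as (low, high).

    Assumption: residues 1-2 delimit one strand and residues 3-4 its partner.
    The residue after "hbond" and the trailing weight are ignored.
    """
    numbers = [int(n) for n in RESIDUE.findall(line)]
    a, b, c, d = numbers[:4]
    return [tuple(sorted((a, b))), tuple(sorted((c, d)))]


def overlaps(new_line, existing_lines):
    """List (new_range, old_range) pairs that share at least one residue."""
    hits = []
    for new_range in strand_ranges(new_line):
        for old_line in existing_lines:
            for old_range in strand_ranges(old_line):
                if new_range[0] <= old_range[1] and old_range[0] <= new_range[1]:
                    hits.append((new_range, old_range))
    return hits


if __name__ == "__main__":
    try12 = "SheetConstraint (T0349)M34 (T0349)L37 (T0349)I44 (T0349)L41 hbond (T0349)S35 1"
    try14 = "SheetConstraint (T0349)Q32 (T0349)S35 (T0349)G42 (T0349)G39 hbond (T0349)S35 4"
    print("try12/try13 constraint:", overlaps(try12, EXISTING))
    print("try14 constraint:", overlaps(try14, EXISTING))
```

Under these assumptions the try12/try13 constraint shares residue I44 with the first existing constraint, while the revised try14 constraint does not touch any existing strand.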
Fri Jul 14 15:19:55 PDT 2006 Crissan Harris, Cynthia Hsu
try14 scored very poorly, did not form sheets as desired, and had a very large break. We reviewed the previous structures and found that our best models had not changed since try11. We re-examined the try10 cost function, and decided to do a number of things: the overall weight of "constraints" was raised to 15, the weight on the sheet constraints was raised to 2, and the dssp-ehl2 constraints, rr constraints, and undertaker alignments were commented out.
try15 is currently running on shaw.
We decided to polish our favorite folds. try7 was already a very polished model of try5. We copied try3.costfcn into try16 for the purposes of polishing try3-opt2, and made the following changes: "dry5" was set to 20, "dry6.5" was set to 25, "dry8" was set to 20, "n_ca_c" was set to 7, "soft_clashes" was set to 50, and "break" was set to 120. We also commented out the dssp-ehl2 constraints, the undertaker alignments, and the rr constraints.
try16 is currently running on squawk.

Sat Jul 15 13:26:55 PDT 2006 Cynthia Hsu
try16 scored reasonably well (better than tries 3 and 10, but worse than 2, 7, 6, and 12). In view of this, I've added it to the superimpose-best.under file, in place of try3. try15 performed significantly worse than most of the others.
Given that try10 is currently the worst-scoring of our favorite models, I've decided to attempt to polish it and improve its scores. It performed poorly on "dry5", "dry6.5", "dry8", "n_ca_c", and "soft_clashes". For try17, I commented out the dssp constraints, the undertaker alignments, and the rr.constraints. I changed "dry5" to 25, "dry6.5" to 33, and "dry8" to 23. I raised "n_ca_c" to 7. I also raised the "constraints" weight to 15 to try to keep the sheets from twisting. "soft_clashes" was raised to 50.
try17 is currently running on lopez.

Sat Jul 15 16:35:05 PDT 2006 Kevin Karplus
It looks like Cynthia's runs are not doing the repacking with rosetta.
Perhaps there is a problem with her path? Perhaps something else, as she is getting floating-point exceptions from rosetta. Hmm, not her problem, as I'm encountering the same difficulty.
Problem resolved: there was no return at the end of guide.a2m.gz (because there wasn't one at the end of T0349.a2m), and rosetta can't tolerate the lack of a final return. I fixed "extract-guide" to add the newline.

Sat Jul 15 16:44:40 PDT 2006 Kevin Karplus
I'll redo all the do1 through do17 targets (skipping the missing do9) to make the missing repacked files and see what rosetta likes. (Started on shaw)

Sat Jul 15 17:07:37 PDT 2006 Kevin Karplus
Rosetta likes best the gromacs0.repack-nonPC files:
    try4, try8, try14, try12, try1, try17, try7
Perhaps we should do an optimization run starting from any of those that are not ludicrous.
try4 seems to have the first helix turned with hydrophobics pointing out.
try8 seems a little better, if we believe the C-terminal strand.
try14 seems similar to try8.
try12 seems similar to try4.
Let's do a polishing run from the try8 and try14 gromacs0.repack-nonPC models.

Sat Jul 15 17:18:34 PDT 2006 Kevin Karplus
I made try18.costfcn to have the constraints of try8-opt2 (sheets and helices). try7-opt2 scores best with the try18 costfcn, but I'll try polishing the try8 and try14 gromacs0.repack-nonPC models with it.

Sat Jul 15 17:23:12 PDT 2006 Kevin Karplus
try18 started on cheep.
(Note: Cynthia and Crissan could do a similar sort of polishing with models similar to try4, or with models that they favor.)

Sat Jul 15 20:10:01 PDT 2006 Kevin Karplus
try18-opt2 is now the second best with the unconstrained costfcn, and try18-opt2.gromacs0.repack-nonPC is the best for rosetta. One of them should probably be the first model.

Sun Jul 16 08:19:49 PDT 2006 Kevin Karplus
I currently favor putting try2-opt2 first and try18-opt2.gromacs0.repack-nonPC second.
All of our models are quite similar, in having a 4-strand sheet, though the top server models have only the first three strands.

Sun Jul 16 08:30:15 PDT 2006 Kevin Karplus
I just noticed that try18.costfcn did not have constraints turned on! Perhaps I should do another run with constraints. I'll also do a run with a score function that favors the C-terminal helix models, using sheet and helix constraints from try4.
try19 gets the best constraints for current gromacs0.repack-nonPC models:
    try4, try12, try11, try16

Sun Jul 16 08:36:03 PDT 2006 Kevin Karplus
try19 started on lopez to optimize C-terminal helix models.
try20.costfcn, based on sheet and helix constraints from try18, gets the best constraints of the current gromacs0.repack-nonPC models in
    try18, try8, try5, try7, try6, try10, try3

Sun Jul 16 08:42:17 PDT 2006 Kevin Karplus
try20 started on lopez to polish (with constraints)
    ReadConformPDB T0349.try18-opt2.gromacs0.repack-nonPC.pdb.gz
    ReadConformPDB T0349.try8-opt2.gromacs0.repack-nonPC.pdb.gz
    ReadConformPDB T0349.try5-opt2.gromacs0.repack-nonPC.pdb.gz
    ReadConformPDB T0349.try7-opt2.gromacs0.repack-nonPC.pdb.gz
    ReadConformPDB T0349.try6-opt2.gromacs0.repack-nonPC.pdb.gz
    ReadConformPDB T0349.try14-opt2.gromacs0.repack-nonPC.pdb.gz
This may be unable to do much, as there may be too much helix in try18, stopping the break from being closed.

Sun Jul 16 09:39:13 PDT 2006 Kevin Karplus
try20-opt2 does score better than try18-opt2 (with the try20 costfcn), but rosetta still likes try18 better (though try20 is considered second best).
try19-opt2 scores best with the try19 costfcn, but rosetta still likes try4 more.
With the unconstrained costfcn, the best are now
    try20-opt2
    try2-opt2
    try18-opt2
    try19-opt2
    try20-opt2
try20 has a horrible break before L30, which gromacs does not noticeably reduce; it is essentially the same as try18 here.
try2 has a horrible break before V43.
try19 has no bad breaks, but it has some clashes that cause gromacs to introduce a break before I36 when fixing the clashes.
The whole loop from V29-V43 seems difficult to predict; our models are all over the place.
There seem to be two main prediction decisions:
    what to do at the C-terminus
        strand: try20, try18, try2, try7, try16, try10
        helix: try19, try4, try11
    what to do with the loop V29-P45
        hairpin: try11, try4, try19
        kink and helix: try20, try18, try7
        inline helix: try2
I'd like to make a chimera of try2 and try4, taking the hairpin from try4 and the rest from try2, but the sheets are in different phases:
    try2:
        SheetConstraint M1 N8 H51 I44 hbond R2 1
        SheetConstraint R2 T7 S73 A68 hbond E3 1       C-terminal strand
        SheetConstraint L28 D31 V48 P45 hbond V29 1
    try4:
        SheetConstraint M1 L4 H51 V48 hbond R2 1       matches
        SheetConstraint L4 R6 L49 R47 hbond L5 1       twisted
        SheetConstraint H27 Q32 V50 P45 hbond L28 1    off-by-1 from try2
I will submit
    try20-opt2    # best unconstrained
    try2-opt2     # 2nd best unconstrained
    try19-opt2    # best with C-terminal helix
    try4-opt2.gromacs0.repack.nonPC.pdb    # different loop with C-terminal helix
                                           # 3rd best rosetta
    try11-opt2    # yet another hairpin-like loop

Sun Jul 16 10:34:24 PDT 2006 Kevin Karplus
So submitted. E-mail me if changes are needed.

Mon Jul 17 13:11:47 PDT 2006 Crissan Harris, Cynthia Hsu
The models submitted seem to agree with our top choices.

Mon Aug 21 16:01:48 PDT 2006 Kevin Karplus
Our best model was try4-opt1-scwrl, but model4 (try4-opt2.gromacs0.repack-nonPC) was close. Of course, this assessment is pretty much junk, since only part of the NMR model is the target. The "correct" model seems to have been trashed a bit in T0349.model1-real.pdb, because the NMR model in 2hfvA is longer on the N-terminus than the sequence given to us, and the crude alignment strategy has picked out the wrong residues to align to. We should fix the alignment of PDB files to the target sequence in undertaker, and redo this evaluation.
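One simple way to handle a PDB chain that is longer than the target on the N-terminus is to slide the target along the chain and keep the ungapped offset with the most identical residues. The Python sketch below illustrates that idea only; it is not undertaker's alignment code, the function name `best_offset` is hypothetical, and the sequences in the usage example are made up rather than taken from T0349 or 2hfvA. It does not handle gaps or residues missing from the coordinates.

```python
def best_offset(target, chain):
    """Return (offset, matches) for the best ungapped placement of target.

    `offset` is the index in `chain` where the first target residue lands,
    and `matches` counts identical residues at that placement. Ties go to
    the smallest offset.
    """
    if len(target) > len(chain):
        raise ValueError("chain is shorter than target")
    best = (0, -1)
    for offset in range(len(chain) - len(target) + 1):
        # zip stops at the end of target, so only the overlapped
        # window of the chain is compared.
        matches = sum(t == c for t, c in zip(target, chain[offset:]))
        if matches > best[1]:
            best = (offset, matches)
    return best


if __name__ == "__main__":
    # Toy sequences: the chain carries four extra N-terminal residues.
    chain = "GSHMKTAYIAKQR"
    target = "KTAYIAKQR"
    offset, matches = best_offset(target, chain)
    print(f"target starts at chain index {offset} with {matches} identities")
```

With the toy sequences, aligning from the first chain residue would pair every target residue with the wrong partner, while the scan places the target four residues in.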