Thu Jun 1 09:05:03 PDT 2006 T0309 Make started
Thu Jun 1 09:21:24 PDT 2006 Running on lopez.cse.ucsc.edu

Thu Jun 1 09:30:06 PDT 2006 Kevin Karplus
BLAST finds modest hits
    2bdqA (93% id over 14 residues)
    1tf7A (31% over 77 residues---but there are only 76 residues!!)
    2gxfA, 2bdtA, 2afcA, 1xklA (100% id over 13 residues)
The short matches seem to be to the HIS tag---not really useful.
This looks like a fairly easy fold-recognition though, as the 1tf7A
hit is essentially full length.

The t06 multiple alignment finds nothing but the B.subtilis protein
itself (minus the 13-residue HIS tag: MAGDPLEHHHHHH), so the BLAST
hit was too distant to be found in NR.

Thu Jun 1 21:40:03 PDT 2006 Kevin Karplus
I don't like the structure in try1-opt2, as it has used the HIS tag
as part of a sheet. I think we should try predicting a subdomain,
M1-E63, which excludes the HIS tag.

Fri Jun 2 08:07:50 PDT 2006 Kevin Karplus
The M1-E63 model seems a bit foamy, but otherwise ok.
We may need to tuck F42 into the interior.
Upping the dry packing terms (including phobic_fit) may help pack
this tighter.

We might also want to try including a couple of the linker residues
from the HIS tag, so that we have better secondary structure
prediction for the KGVE residues at the end of the native protein.

Sat Jun 17 02:04:45 PDT 2006 Kevin Karplus
The M1-G66/try1-opt2 is quite similar to the M1-E63/try1-opt2, but
the G66 is buried inside, making it inaccessible for adding the HIS
tag. There is also a difference in alignment of the sheet:

    # from M1-G66/decoys/try1-opt2.sheets
    SheetConstraint F14 F15 T50 I49 hbond F14 1

    # from M1-E63/decoys/try1-opt2.sheets
    SheetConstraint N10 F15 K52 V47 hbond K12 1

I tried making a chimera of M1-G66 try1-opt2 and undertaker-align
model2 (which has the HIS tag from D67 on in a reasonable place).
This is chimera1 with the crossover between G66 and D67.
There is a bad break, but we may be able to fix that.
I also made chimera2 from M1-G66 try1-opt2 and undertaker-align,
crossing over between V62 and E63.
I also made chimera3 from M1-E63 try1-opt2 and undertaker-align,
crossing over between V62 and E63.

try2.costfcn and try3.costfcn use the sheet constraints from M1-G66
try4.costfcn uses the sheet constraints from M1-E63

    Try2 will optimize chimera1 (on lopez)
    Try3 will optimize chimera2 (on shaw)
    Try4 will optimize chimera3 (on shaw)

Sat Jun 17 02:43:47 PDT 2006 Kevin Karplus
It looks like the gaps will be closed fairly quickly in all three
runs, and the final models will be at least as good as the try1-opt2
model that buried the HIS tag in the sheet.

Sat Jun 17 08:12:57 PDT 2006 Kevin Karplus
Although try3-opt2 scores better than try4-opt2, I like the sheets
of try4-opt2 better. We might be able to improve the packing of try4
by adding Hbond E63.N P58.O, which seems to be trying to form.
The break before E63 did not close in try4-opt2, because V62 is
buried too deep.

Perhaps another chimera---one with the guts of try4 but the N- and
C-termini from try3-opt2?

Hmm, I'm not sure whether I like the N-terminus from try3 or try4
better---try3 buries M1, but try4 solvates S3.OG. The burial of M1
results in some pretty bad clashes, so maybe I should stick with
try4's N-terminus. Actually, try2's N-terminus is also a reasonable
option.

Take M1-V6 from try2, H7-K60 from try4, and G61-H76 from try3.
This chimera (chimera4) does not have the possible E63.N P58.O Hbond.
In fact, E63 looks too buried. Perhaps optimization will free that up
a bit.

Sat Jun 17 08:55:13 PDT 2006 Kevin Karplus
optimizing chimera3 and chimera4 as try5 on shaw

Sat Jun 17 09:21:15 PDT 2006 Kevin Karplus
Interestingly, chimera3 seems to have been the model picked by try5
to improve. I'll start a try6 with the same costfcn as try5, but with
just chimera4 as a starting point.
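The chimera4 recipe above (M1-V6 from try2, H7-K60 from try4, G61-H76 from try3) can be sketched as a residue-range splice. This is only an illustration, not the undertaker procedure actually used: the names splice_chimera, make_toy_model, toy_models, and chimera4_plan are hypothetical, and the toy residue records carry no coordinates.

```python
from typing import Dict, List, Tuple

# Toy residue record: (residue number, residue label). A real pipeline would
# carry atom coordinates; this sketch tracks only which model each residue
# came from. All names here are hypothetical, not part of undertaker.
Residue = Tuple[int, str]
Piece = Tuple[str, int, int]  # (model name, first residue, last residue)


def splice_chimera(models: Dict[str, List[Residue]],
                   pieces: List[Piece]) -> List[Tuple[int, str, str]]:
    """Concatenate contiguous residue ranges taken from different models.

    Returns (residue number, residue label, source model) triples.
    Raises ValueError if consecutive ranges leave a gap or overlap, or if
    a model is missing a residue in its requested range.
    """
    chimera: List[Tuple[int, str, str]] = []
    expected_first = None
    for model_name, first, last in pieces:
        if expected_first is not None and first != expected_first:
            raise ValueError(
                f"{model_name} range starts at {first}, expected {expected_first}")
        segment = [r for r in models[model_name] if first <= r[0] <= last]
        if [num for num, _ in segment] != list(range(first, last + 1)):
            raise ValueError(f"{model_name} lacks residues in {first}-{last}")
        chimera.extend((num, label, model_name) for num, label in segment)
        expected_first = last + 1
    return chimera


def make_toy_model(length: int) -> List[Residue]:
    """Build a placeholder model with residues numbered 1..length."""
    return [(i, f"res{i}") for i in range(1, length + 1)]


# Three 76-residue placeholder models standing in for the optimized decoys.
toy_models = {name: make_toy_model(76) for name in ("try2", "try3", "try4")}

# Crossovers as described in the log: after V6 and after K60.
chimera4_plan = [("try2", 1, 6), ("try4", 7, 60), ("try3", 61, 76)]
chimera4 = splice_chimera(toy_models, chimera4_plan)

for number, label, source in (chimera4[5], chimera4[6], chimera4[60]):
    print(number, label, source)
```

The splice only does the bookkeeping. It does not superimpose the pieces or close the chain breaks at the crossovers, which is what the optimization runs in this log are for.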
Sat Jun 17 09:23:48 PDT 2006 Kevin Karplus
try6 started on lopez

Sat Jun 17 10:18:26 PDT 2006 Kevin Karplus
history to keep track of where models came from:
    M1-G66 -> chimera1, chimera2 -> try2, try3
    M1-E63 -> chimera3 -> try4, try5
    try4 -> chimera4 -> try6

Sat Jun 17 10:30:21 PDT 2006 Kevin Karplus
try6-opt2 and try5-opt2 are the two top-scoring models with try6
(=try5=try4) costfcn. They are also the best with the unconstrained
costfcn. Both undertaker and rosetta prefer try6 to try5.
I'm reasonably happy with both.

I'll do a polishing run with breaks and clashes turned up, starting
from all our models.

Sat Jun 17 10:44:49 PDT 2006 Kevin Karplus
try7 started on lopez

Sat Jun 17 13:03:02 PDT 2006 Kevin Karplus
try7-opt2 is best scoring with try7 costfcn, and rosetta likes
repacking it best of any of the models. It seems to be based on try6,
though there may have been some crossover.

I think we've reached the point of diminishing returns here.
I'll submit
    ReadConformPDB T0309.try7-opt2.pdb
    ReadConformPDB T0309.try5-opt2.pdb
    ReadConformPDB T0309.try3-opt2.pdb
    ReadConformPDB T0309.try4-opt2.pdb
    ReadConformPDB T0309.try2-opt2.pdb

History to keep track of where models came from:
    M1-G66 -> chimera1 -> try2
    M1-G66 -> chimera2 -> try3
    M1-E63 -> chimera3 -> try4, try5
    try4 -> chimera4 -> try6 -> try7

Mon Jul 3 12:59:49 PDT 2006 Kevin Karplus
It may be worthwhile to do a polishing run with breaks and clashes
turned up starting only from the gromacs models (to escape from local
minima).
try8 started on farm cluster.

Mon Jul 3 16:47:26 PDT 2006 Kevin Karplus
try8 greatly reduced breaks and clashes, but is still pretty foamy.
Rosetta likes it best of any of the backbones for repacking.

Mon Jul 3 17:01:47 PDT 2006 Kevin Karplus
Trying one more polishing run from all the gromacs optimized models,
in the hope of beating try8-opt2.
After that I should probably do a polishing run from all models.
(try9 started on farm cluster)

Tue Jul 4 15:06:48 PDT 2006 Kevin Karplus
For some reason, gromacs is not running on the farm cluster, so I
reran it on cheep. I also tried making
try8-opt2.gromacs0.repack-nonPC (and try9...).

Currently, the best score with try9.costfcn is T0309.try9-opt1 (not
opt2, as there are some larger breaks in try2, apparently).

Rosetta likes best
    decoys/T0309.try9-opt1.gromacs0.repack-nonPC.pdb.gz
    decoys/T0309.try8-opt2.gromacs0.repack-nonPC.pdb.gz
    decoys/T0309.try9-opt2.gromacs0.repack-nonPC.pdb.gz

Tue Jul 4 15:27:34 PDT 2006 Kevin Karplus
I'll do one more polishing run (try10) on the farm cluster, but I
think we've reached the point of diminishing returns---we're unlikely
to make the model better with further refinement.

Tue Jul 4 17:25:45 PDT 2006 Kevin Karplus
I will submit with the following comments:

T0309 is an ORFan---we found no similar sequences in NR, even with
our most-sensitive iterated searches. This makes all of our
prediction methods (neural nets, HMMs, ...) much less accurate, as we
have no signal from evolutionary sampling of the fold.

Our first model had a serious flaw---it included the HIS tag in a
sheet. Under the assumption that the structure is formed without the
HIS tag, we made two subdomain predictions, M1-E63 and M1-G66.

HIS tags were pasted onto the automatically generated subdomain
models, and the resulting chimeras optimized. Closing the gaps was a
bit hard, as the subdomain models had not necessarily left the
C-terminus on the surface. We also tried taking the sheet from one of
the optimizations, and adding the HIS tag from one of the others.
history to keep track of where models came from:
    M1-G66 -> chimera1 -> try2
    M1-G66 -> chimera2 -> try3
    M1-E63 -> chimera3 -> try4, try5
    try4 -> chimera4 -> try6 -> try7
    try5-opt.gromacs0 -> try8-opt2.gromacs0 -> try9-opt1.gromacs0.repack-nonPC -> try10

None of the HIS tags are very convincing, but HIS tags are often
disordered, so there is not much point in trying to optimize the
prediction of their structure.

Model 1 is try10-opt2, which twice had pieces from other
optimizations stuck onto the M1-E63 base model. It scores best with
our cost functions.

Model 2 is try7-opt2, optimized from chimera4.

Model 3 is try3-opt2, which uses a different alignment of the
strands.

Model 4 is try4-opt2, yet another optimization from chimera3.
It is the base model which had N- and C-termini replaced to make
chimera4, which was optimized to form our best-scoring model.

Model 5 is try2-opt2, which is an optimization of chimera1, based on
the same underlying model as try3-opt2 (model 3), but with a
different HIS tag attached.

Wed Mar 21 20:56:24 PDT 2007 Kevin Karplus
Our best model is align2, an alignment to 2fvtA, which was better
than any of the server models in the "real_cost" function, though the
large number of missing atoms makes me suspect that this is an RMSD
artefact, and that it actually did rather poorly.

Our best complete model was try6-opt1.
The best we submitted was model4=try4-opt2, which was only slightly
worse.
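The model history kept in this log can also be held as a small lookup table and traced mechanically. The sketch below transcribes only the arrows written in the history lines (so chimera4 is listed under try4 alone, even though its termini came from try2 and try3), and it leaves out the gromacs chain leading to try10. The names PARENT and trace_origin are hypothetical.

```python
from typing import Dict, List

# Child -> parent, transcribed from the "history" lines of the log.
# Assumption: each model is given a single parent, as the arrows are written;
# pieces borrowed from other models are not recorded here.
PARENT: Dict[str, str] = {
    "chimera1": "M1-G66",
    "chimera2": "M1-G66",
    "chimera3": "M1-E63",
    "try2": "chimera1",
    "try3": "chimera2",
    "try4": "chimera3",
    "try5": "chimera3",
    "chimera4": "try4",
    "try6": "chimera4",
    "try7": "try6",
}


def trace_origin(model: str, parent: Dict[str, str] = PARENT) -> List[str]:
    """Return the chain from a model back to its subdomain prediction."""
    chain = [model]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain


# Trace each of the submitted models that appears in the table.
for submitted in ("try7", "try3", "try4", "try2"):
    print(" <- ".join(trace_origin(submitted)))
```

Tracing try7 this way walks back through try6, chimera4, try4, and chimera3 to the M1-E63 subdomain, while try2 and try3 both end at M1-G66.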