Tue Jul 8 13:46:29 PDT 2008 TR429
The Make.main file was modified to fetch the refinement model from
http://predictioncenter.gc.ucdavis.edu/casp8/target.cgi?target=TR429&view=template
I trimmed the a2m file and am starting a new prediction for this subdomain.
Make started Tue Jul 8 14:01:17 PDT 2008
Running on moai08.kilokluster.ucsc.edu

Tue Jul 8 14:03:33 PDT 2008 Kevin Karplus
The CASP organizers also recommend doing subdomains:
    M1-R100 (actually E22-R100)
    A101-P176

Tue Jul 8 14:55:17 PDT 2008 Kevin Karplus
I looked for the model in the server models and our T0429 models---none came very close, with GDTs of at most 39% to TR429.
The closest models to TR429 are
    by GDT:
        RAPTOR_TS3
        GS-KudlatyPred_TS5
        Phragment_TS5
        try1-opt3.gromacs0.pdb
        COMA-M_TS1
    by real_cost:
        Phragment_TS5
        RAPTOR_TS3
        try1-opt3.gromacs0.repack-nonPC
        Phyre_de_novo_TS1
It looks like we did not do so well on this target (though we did submit a try1 model as our model 5, which may not be too terrible).
I suspect that TR429 does well on the two subdomains, but still doesn't pack them quite right against each other.

What the CASP organizers say is
    REFINEMENT TARGET TR429 (one of the best submitted models).
    MODEL GDT_TS=47.
    This is a two-domain protein: D1: 1-100, D2:101-178.
    You can refine each domain separately (if desired) and then submit the refined domains in one file (as a model for the whole sequence).
    Refinement of each domain will be evaluated separately.
    Residues 1-21, 26, 55-72, 155, 177-178 are missing in the experimental structure.
    Residues 1-21 and 177-178 are cut out from the model.

Tue Jul 8 19:02:45 PDT 2008 Kevin Karplus
TR429 scores poorly on almost all the measures, including clashes and breaks.
I'll probably have to extract constraints from it, though it doesn't seem to have many sheet constraints (unlike the try1 models).

Tue Jul 8 19:06:27 PDT 2008 Kevin Karplus
I think that TR429 needs sheets, but doesn't quite form them.
I should look to see what sheet constraints come closest to what it has, and try enforcing them.

Tue Jul 8 21:04:08 PDT 2008 Kevin Karplus
I've started separate subdomain predictions.
The N-terminal domain for the initial TR429 model is compatible in its sheets with the try1 model, so I'll probably end up using the sheet constraints from try1-opt3 to try to clean up the original model for the N-terminal part.
The C-terminal domain is more compatible with the alignment to 2bkdN (align2.sheets).
While waiting for the individual domains to finish, I'll do a try2 run to remove clashes and breaks and try to form the sheets better.

Tue Jul 8 22:23:29 PDT 2008 Kevin Karplus
I may want to do another run, since try2 does not seem to be reducing breaks as fast as I had expected.

Thu Jul 10 15:32:03 PDT 2008 Kevin Karplus
I made a chimera of try2-opt3 and R100-P176/try1-opt3:
    N-terminal region from TR429.try2-opt3.pdb
    I105-P176 from R100-P176/try1-opt3
I'll optimize this as try3, using the same costfcn as for try2.
I'll also cut up the TR429 model and optimize the parts in the subdomains.

Thu Jul 10 15:40:38 PDT 2008 Kevin Karplus
E22-A101/try2 optimization of TR429 started.
Currently the try1-opt3 model scores much better, though it has worse breaks, so I may want to try doing an optimization of that as well.
The bad breaks are before V75 and I76, very close to the unresolved region 55-72.
I might try a chimera that copies 55-76 from TR429 into try1-opt3, and optimize that.

Thu Jul 10 15:48:15 PDT 2008 Kevin Karplus
R100-P176/try2 optimization of TR429 started.
I almost certainly want to optimize the try1 models, since the TR429 model is really awful in the C-terminal domain.

Thu Jul 10 15:59:20 PDT 2008 Kevin Karplus
E22-A101/try3 optimization started to optimize E22-A101/chimera-try1-init, which is mostly from the E22-A101/try1-opt3 model, but which has K55-I76 from TR429, since the E22-A101/try1-opt3 model had bad breaks in that region.
(It is not important that we get this region right, since it is a floppy loop that was not resolved in the crystal, but eliminating the break there will make it easier to optimize the rest of the domain.)

Thu Jul 10 17:16:00 PDT 2008 Kevin Karplus
For the N-terminal domain, we have two main lines:
    try3-opt3.gromacs0
    N-try3 = E22-A101/try3-opt3
If doing a cut-and-paste operation, V97 is a good common residue.
I may want to copy L71-I76 from try3-opt3 to the N-try3 domain, to move the break into the loop region that we don't care about.
I think it is worth making a chimera of the try3 with the N-try3 domain, and optimizing that as try4.

Thu Jul 10 17:44:36 PDT 2008 Kevin Karplus
I made chimera-N3-try3
    E22-A70, Y77-V97 from E22-A101/try3-opt3
    L71-I76, R98-P176 from try3-opt3.gromacs0
and am optimizing it as try4.
I'll take the C-terminal part of try3-opt3.gromacs0 and try re-optimizing it in R100-P176.

Thu Jul 10 17:52:12 PDT 2008 Kevin Karplus
R100-P176/try3 will try optimizing the C-terminal domain of try3-opt3.gromacs0.
R100-P176/try4 will try optimizing all the models of R100-P176 (I expect it to concentrate on R100-P176/try2-opt3, as that has much smaller breaks than the from-try3 model).

Thu Jul 10 19:52:11 PDT 2008 Kevin Karplus
R100-P176/try4-opt3 (from R100-P176/try1-opt3) scores better than R100-P176/try3-opt3.gromacs0, due mainly to smaller breaks.
The C-terminal domain has two lineages of solutions:
    try4-opt3 from SAM+undertaker
    try3-opt3.gromacs0 from TR429
I'll do one more optimization with the N and C domains both coming from the SAM lineages (try5 from chimera-try4-C4).
I'll also put together a chimera with both halves coming from the TR429 lineage (chimera-N2-C2).

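The chimera recipes in these notes all follow the same pattern: copy listed residue ranges from one model and the remaining ranges from another. The notes do not say which tool did the cut-and-paste, so the following is only a minimal Python sketch of the idea, not the procedure actually used. The function names and the file paths in the usage comment are hypothetical. It assumes both models use the same residue numbering and already sit in the same coordinate frame (no superposition on a common residue such as V97 is attempted), and it does not renumber atom serial numbers.

```python
def residue_number(atom_line):
    """Residue sequence number from a PDB ATOM record (columns 23-26)."""
    return int(atom_line[22:26])


def splice_models(sources):
    """Build a chimera from several models.

    sources: list of (pdb_text, ranges) pairs, where ranges is a list of
    inclusive (first, last) residue-number pairs to copy from that model.
    Returns the selected ATOM lines, ordered by residue number.
    """
    owner = {}   # residue number -> index of the source that supplies it
    picked = {}  # residue number -> ATOM lines for that residue
    for index, (pdb_text, ranges) in enumerate(sources):
        for line in pdb_text.splitlines():
            if not line.startswith("ATOM"):
                continue
            resnum = residue_number(line)
            if not any(first <= resnum <= last for first, last in ranges):
                continue
            # Refuse recipes in which two sources claim the same residue.
            if owner.setdefault(resnum, index) != index:
                raise ValueError("residue %d claimed by two sources" % resnum)
            picked.setdefault(resnum, []).append(line)
    return [line for resnum in sorted(picked) for line in picked[resnum]]


# Hypothetical usage, mirroring the chimera-N3-try3 recipe
# (the file paths are assumptions, not taken from the notes):
# chimera = splice_models([
#     (open("E22-A101/try3-opt3.pdb").read(), [(22, 70), (77, 97)]),
#     (open("try3-opt3.gromacs0.pdb").read(), [(71, 76), (98, 176)]),
# ])
```

A splice like this leaves chain breaks wherever the two source models disagree at a junction, which is why each chimera in these notes is followed by another optimization run.
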
Thu Jul 10 20:42:30 PDT 2008 Kevin Karplus
For chimera-N2-C2, I ended up using a little bit of the linker from try4:
    E22-R100 E22-A101/try2-opt3
    A101-L110 try4-opt3.gromacs0
    E111-P176 R100-P176/try2-opt3
I'll optimize this as try6.
There is a little bit of sheet in the N2 model that might be worth trying to incorporate into the other lineage also---I might want to reoptimize with the try6 costfcn from the try5 models also, when they are done.
Or, maybe, I should try doing another run on just the first domain.

Thu Jul 10 20:52:39 PDT 2008 Kevin Karplus
E22-R100/try4 started to try to get good sheets from both lineages.

Fri Jul 11 00:04:15 PDT 2008 Kevin Karplus
I should probably patch in the N4 domain to make another model.
I also should investigate making L27-V30 into a strand, as predicted.

Sun Jul 13 09:27:46 PDT 2008 Kevin Karplus
try7 started to optimize chimera-N4-try5.

Sun Jul 13 10:03:02 PDT 2008 Kevin Karplus
I tried looking for a model (among the server models and SAM+undertaker models) that would pack the two barrels better, but I didn't find one.
I don't think that I have the time or the tools to dock the domains against each other.

Sun Jul 13 13:08:38 PDT 2008 Kevin Karplus
try7-opt3 doesn't score as well as try5-opt3, though the N4.sheets are better.
Rosetta likes try7-opt3.gromacs0.repack-nonPC best.
It might be worth doing another optimization just from the gromacs-optimized try7 model.
Or maybe I should patch N5 into try7 and optimize that.
It's probably not worthwhile to patch in N5, as it didn't make the extra strand that I had requested, though it may have improved a couple of the Hbonds.
It would take a while to clean up the clashes and breaks in the N5 domain, so I may be better off optimizing try7 without patching in N5 (I could use the sheet constraint for trying to add the extra strand, though).

Sun Jul 13 13:25:30 PDT 2008 Kevin Karplus
try8 started from all gromacs-optimized models (except the try5 ones) to attempt improvements to try7.
The added_strand constraint set was added to try to get a better N-terminus.

Sun Jul 13 14:40:55 PDT 2008 Kevin Karplus
try8-opt3 scores best with the try8 costfcn and try8-opt3.gromacs0.repack-nonPC scores best with the Rosetta energy function.

Sun Jul 13 14:58:03 PDT 2008 Kevin Karplus
I'll do one more polishing run for try2, the rather awful model polished from TR429, and use the try9 scoring to help choose which models to submit.

Sun Jul 13 16:33:46 PDT 2008 Kevin Karplus
try9-opt3 improves on try2-opt3.gromacs0, but it will still only be model 5.
The big question is whether try5-opt3 or try8-opt3.gromacs0.repack-nonPC should be my number 1 model.
I like the helix at A70-D74 in try8 better, and it gets slightly better burial (until gromacs and rosetta mess it up), but I think I'll leave try5 first.
The differences are fairly small and probably don't matter much.

Sun Jul 13 16:58:20 PDT 2008 Kevin Karplus
Submitted with comment:

For a REFINEMENT model, the predictions are redone for the sequence included in the refinement model, then both the automatic model and supplied model are further optimized, initially taking sheet and helix constraints from the supplied model to refine.

In the case of TR429, the supplied model is rather terrible, with bad breaks and clashes and incompletely formed sheets.
The domains were separately predicted, and predictions were pasted into the supplied model.
Only model 5 was optimized directly from TR429, though model 4 was constructed from domains separately optimized from TR429.

The other models submitted have nothing left of the original TR429 model, except the placement of the two domains, which I believe is incorrect, as the hydrophobic residues I104, F166, I164 are all exposed---I believe they should be packed against the other barrel.
I tried looking for a model (among the server models and SAM+undertaker models) that would pack the two barrels better, but I didn't find one.
I don't think that I have the time or the tools to dock the domains against each other properly.

Model
1 TR429.try5-opt3.pdb # < chimera-try4-C4
    chimera-try4-C4:
        initially TR429.try4-opt3.gromacs0.pdb
        L110-P176 from R100-P176/try4-opt3.gromacs0 < R100-P176/try1-opt3.gromacs0 < align(2f5kA)
    try4-opt3 < chimera-N3-try3
    chimera-N3-try3:
        E22-A70, Y77-V97 from E22-A101/try3-opt3 < E22-A101/chimera-try1-init
        L71-I76, R98-P176 from try3-opt3.gromacs0
    try3-opt3 < chimera-try2-C1
    chimera-try2-C1:
        N-terminal region from TR429.try2-opt3.pdb
        I105-P176 from R100-P176/try1-opt3 < align(2f5kA)
    try2-opt3 < TR429 (initial model)
    E22-A101/chimera-try1-init:
        mostly from E22-A101/try1-opt3 < align(1mhnA)
        loop K55-I76 from TR429.pdb
2 TR429.try8-opt3.gromacs0.repack-nonPC.pdb # < try7-opt3.gromacs0 < chimera-N4-try5 # best Rosetta energy
    chimera-N4-try5:
        E22-V97 from E22-A101/try4-opt3 < try3-opt3 < E22-A101/chimera-try1-init
        R98-P176 from TR429.try5-opt3.pdb (see model 1)
3 TR429.try4-opt3.gromacs0.pdb # < chimera-N3-try3
4 TR429.try6-opt3.gromacs0.pdb # < chimera-N2-C2
    chimera-N2-C2:
        E22-R100 E22-A101/try2-opt3 < TR429
        A101-L110 try4-opt3.gromacs0
        E111-P176 R100-P176/try2-opt3 < TR429
5 TR429.try9-opt3.pdb # < try2-opt3.gromacs0 < TR429

Mon Nov 10 10:44:42 PST 2008 Kevin Karplus
By GDT, model5 is the best I submitted, though model3 does better by real_cost.
GDT is not really improved by the refinement, but other real_cost measures are.
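
For readers unfamiliar with the GDT numbers quoted in these notes: GDT_TS is the average, over distance cutoffs of 1, 2, 4, and 8 Angstroms, of the percentage of target residues whose C-alpha atom in the model lies within that cutoff of its position in the experimental structure. The Python sketch below is a simplification under stated assumptions. It takes per-residue distances already computed from a single superposition, whereas the evaluation program searches over many superpositions to maximize each count, so this version can only underestimate the official score. The function name and its arguments are hypothetical.

```python
def gdt_ts(distances, n_target=None):
    """Approximate GDT_TS (0-100) from per-residue CA-CA distances in Angstroms.

    distances: distances for the residues present in both model and target,
        measured after one fixed superposition (an assumption of this sketch).
    n_target: number of residues in the target; defaults to len(distances).
    """
    cutoffs = (1.0, 2.0, 4.0, 8.0)
    n = len(distances) if n_target is None else n_target
    if n <= 0:
        raise ValueError("need at least one target residue")
    # Fraction of target residues within each cutoff, then the mean of the four.
    fractions = [sum(1 for d in distances if d <= c) / n for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)
```

Passing n_target matters when the model omits residues that are present in the target, since missing residues count against the score.
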