Fri Jul 18 13:44:08 PDT 2008    TR461

Fri Jul 18 13:47:33 PDT 2008 Kevin Karplus
    The CASP organizers say
        REFINEMENT TARGET TR461 (the best submitted model according to GDT_TS).
        Quite accurate model: GDT_TS=88; RMSD=1.22A.
        Residues 1-15 and 178-189 are missing in the experimental structure.
        We suggest refinement of the model for residues 20-176.

Make started Fri Jul 18 13:49:56 PDT 2008
Running on moai06.kilokluster.ucsc.edu

Fri Jul 18 14:30:32 PDT 2008 Kevin Karplus
    T0461 was a server-only model, so I have no prior knowledge of it.
        MQAU quality assessment predicted Pcons_multi_TS2 with GDT of 77.3%
        MQAC predicted Zhang-Server (4,1,5,2) with GDT of 79%

    With 32 residues out of 189 eliminated, the GDT should go up quite a bit.
    I wonder if their 88% is for 16-177 or 20-177?

    2nx8A seems like an obvious template, with 33% id and a BLAST e-value of 1e-23.
    1wwrA and 1z3aA should also be good templates.

Fri Jul 18 20:06:20 PDT 2008 Kevin Karplus
    TR461 is very close to
        BAKER-ROBETTA_TS4 (GDT=93.2%, rmsd=1.96),
    and not too far from
        SAM-T08-server_TS1 (GDT=91.9%, rmsd=2.48) or
        try1-opt3.gromacs0.repack-nonPC (GDT=91.2%, rmsd=2.24)

Fri Jul 18 20:33:49 PDT 2008 Kevin Karplus
    For try2, I'll optimize just the provided model, but with sheet and helix constraints taken from it and from align1 (2nx8A) and align2 (2b3jA). The sheet constraints for align1 and align2 are both compatible with the TR461 model, but may not be quite compatible with each other.

    Actually the models from the first 4 alignments are all almost identical in the core region, with only little movements of the loops.

    The loop R89-V98 needs work, as there is an insertion relative to the templates. S146-P155 also seems to be variable.

    I think that the provided model is wrong for P160-A164, which has a conserved structure in all the templates.

    try3 will use the same costfcn as try2, but work from alignments.
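The GDT question in the 14:30 entry above (why dropping the unobserved residues should raise the score; 189 - 32 = 157 residues, i.e. 20-176) can be illustrated with a rough sketch. This is a simplification, not the official GDT_TS procedure: real GDT_TS searches over many superpositions for each cutoff, while the sketch assumes the two CA traces are already superposed and just averages the fraction of CA atoms within 1, 2, 4, and 8 Angstroms. The function names and the dict-of-coordinates input are hypothetical.

```python
import math


def gdt_ts(model_ca, ref_ca, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Approximate GDT_TS (percent) for two ALREADY-SUPERPOSED CA traces.

    model_ca, ref_ca: dicts mapping residue number -> (x, y, z).
    Every residue in ref_ca is counted; a residue missing from the
    model counts as outside every cutoff.
    Assumption: a single fixed superposition is used, so the result is
    at best a lower bound on the true GDT_TS.
    """
    if not ref_ca:
        raise ValueError("reference has no residues")
    dists = [
        math.dist(model_ca[r], ref_ca[r]) if r in model_ca else math.inf
        for r in ref_ca
    ]
    fractions = [sum(d <= c for d in dists) / len(dists) for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


def restrict(ca, first, last):
    """Keep only residues first..last (inclusive), e.g. 20..176."""
    return {r: xyz for r, xyz in ca.items() if first <= r <= last}
```

Scoring restrict(model, 20, 176) against restrict(ref, 20, 176) instead of the full traces shows how much of a low score comes from the trimmed ends rather than from the core.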

Sat Jul 19 10:10:57 PDT 2008 Kevin Karplus
    try2-opt3 has the best score with the try2/try3 costfcn, but try3-opt3.gromacs0.repack-nonPC scores best with rosetta.

Sat Jul 19 10:15:16 PDT 2008 Kevin Karplus
    I don't know what I was thinking last night---P160-A164 is the usual structure. Only the alignment to 1wwrA is different there, not the provided model.

    try2 and try3 differ on A147-R154 and S91-E97.

    C111.SG, C114.SG, E77.OE1, and H75.(ND1 or NE2) look like they ought to form a metal-binding site. Sure enough, there is zinc there in 2nx8A.

        TR461   2nx8A
        C111    C95
        C114    C98
        H75     H65
        E77     E67

    Constraints
        C111.SG C114.SG 4.09
        C111.SG H75.ND1 3.70
        C111.SG E77.CD  6.26
        C114.SG H75.ND1 5.01
        C114.SG E77.CD  3.51
        H75.ND1 E77.CD  5.41

    There is also a sulfate on the other side of the HIS.

        H75     H65
        K70     R56

    Constraint
        H75.NE2 K70.NZ  6.77

    C138 and C158 are unusually close, but there do not seem to be any charged residues in the neighborhood for metal binding. There are no corresponding CYS pairs in 2nx8A.

Sat Jul 19 13:09:29 PDT 2008 Kevin Karplus
    C138 and C158 are not conserved in the alignments (unlike C111, C114, H75, and E77). Even C50 is better conserved, though the corresponding C40 in 2nx8A doesn't seem to have any obvious special role.

    T107 is conserved, and the corresponding T91 appears to make a hydrogen bond to a tightly bound water molecule. That water molecule in turn makes Hbonds to two backbone oxygens.

        T107    T91
        P47     P37

    This indirect Hbonding may not be conserved in TR461---at least I don't see an obvious correspondence of residues to be the other Hbonded atoms.

Sat Jul 19 13:38:21 PDT 2008 Kevin Karplus
    I'm starting try4 from the existing models, with extra constraints for the zinc site and the sulfate site.

Sun Jul 20 17:11:11 PDT 2008 Kevin Karplus
    try4-opt3 really improves the zinc site, though gromacs throws some of that away (perhaps try4 introduced some clashes, though the soft_clashes score is quite small).
    More likely is that try4 picked up a bad bond length or bond angle somewhere. I'll try again, starting from the try4 gromacs models.

Sun Jul 20 17:24:42 PDT 2008 Kevin Karplus
    try5 is running from the gromacs-optimized models, to try to insert the zinc site in them.

    The regions that have high variability are R89-V98 and A147-F156, nowhere near the zinc site. I wonder if I should try making some chimeras, or if I should just try making more models from alignments, to find better loops in either of these two regions.

Sun Jul 20 22:37:08 PDT 2008 Kevin Karplus
    I should look at the different choices for F20. And I should do more searching for variant loops for 2nx8A alignments.

    try6 is such a search, holding onto the try2 sheets and helices, and the zinc-binding site. Clashes and breaks are not as heavily weighted as in try5.

Mon Jul 21 06:25:50 PDT 2008 Kevin Karplus
    try6 still has breaks in both the insert loops, so if I want to generate more loop possibilities, I'll have to do it with bigger break costs. I'll try again as try7. Also in try7, I won't run the initial model from alignment through SCWRL, so that I can measure the zinc constraints a little more directly, and make sure those atoms don't move.

    For try8, I'll polish try6, using the try5 costfcn (without the "sulfate" cost).

Mon Jul 21 08:55:59 PDT 2008 Kevin Karplus
    Something is wrong here, as try8 scored worse than try6. Oops--it was somehow trying to read non-existent try8 files as initial models, rather than try6 models. Although try8 picked up much of the core alignment, I don't care for the loops at all. Let me redo try8 as try9.

    try1,3,6,7 all provide alternative loops for the two insert regions.

    try9 is polishing try6. If that works ok, then I should try polishing 1, 3, and 7.
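
Measuring the zinc site directly, as planned for try7, amounts to comparing atom-pair distances in a model against the constraint targets. A minimal sketch of that check follows, using the target distances from the Jul 19 constraint list. The coordinate dictionary keyed by labels such as "C111.SG" and both function names are assumptions made for illustration; this is not how undertaker evaluates its constraints.

```python
import math

# Target distances in Angstroms, copied from the Jul 19 constraint list.
ZINC_CONSTRAINTS = {
    ("C111.SG", "C114.SG"): 4.09,
    ("C111.SG", "H75.ND1"): 3.70,
    ("C111.SG", "E77.CD"): 6.26,
    ("C114.SG", "H75.ND1"): 5.01,
    ("C114.SG", "E77.CD"): 3.51,
    ("H75.ND1", "E77.CD"): 5.41,
}


def constraint_deviations(coords, constraints=ZINC_CONSTRAINTS):
    """Return {(atom_a, atom_b): measured - target} for each constraint.

    coords maps labels like "C111.SG" to (x, y, z) tuples
    (a hypothetical input format, extracted from a model beforehand).
    """
    return {
        (a, b): math.dist(coords[a], coords[b]) - target
        for (a, b), target in constraints.items()
    }


def worst_violation(coords, constraints=ZINC_CONSTRAINTS):
    """Return the atom pair with the largest absolute deviation, and that deviation."""
    devs = constraint_deviations(coords, constraints)
    pair = max(devs, key=lambda p: abs(devs[p]))
    return pair, devs[pair]
```

Running worst_violation on coordinates taken from a template-derived model is a quick way to spot a target distance that disagrees with the template geometry.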
    In try7-init (which had no SCWRL done on the conserved residues), the zinc site has
        C111.SG C114.SG 4.08
        C111.SG E77.OE1 5.57
        C111.SG E77.OE2 6.43
        C111.SG H75.ND1 3.70
        C114.SG E77.OE1 3.63
        C114.SG E77.OE2 3.59
        C114.SG H75.ND1 3.81
        E77.OE1 H75.ND1 5.30
        E77.OE2 H75.ND1 4.83

    It looks like I've had the C114.SG-H75.ND1 distance wrong in the zinc constraints I've been using. I'll need to reoptimize each of the models I plan to submit with this better set of zinc constraints.

Mon Jul 21 12:56:25 PDT 2008 Kevin Karplus
    try10.costfcn is the try9 costfcn with the zinc constraint fixed. It scores try9-opt3 best, but in part that is because the zinc constraint is too heavily weighted (weight=50). I'll scale it back and rerun.

Mon Jul 21 14:56:01 PDT 2008 Kevin Karplus
    With the zinc scaled down, try10 prefers try5-opt3, try4-opt, try9-opt3, ... which matches my opinion better.

    Now I should optimize try5 try9 try3 try7 try1 with the try10 costfcn.

Mon Jul 21 15:03:25 PDT 2008 Kevin Karplus
        try10   polish try5
        try11   polish try9
        try12   polish try3
        try13   polish try7
        try14   polish try1

Mon Jul 21 17:14:15 PDT 2008 Kevin Karplus
        try10-opt3  no breaks, but gromacs breaks the chain before K122
        try11-opt3  TR461.try11-opt3.pdb.gz breaks before (TR461)P155 with cost 3.47432
        try12-opt3  no breaks, but gromacs breaks the chain before N71, E97, H75, K70
        try13-opt3  breaks before E100, S22, F156, P150
        try14-opt3  TR461.try14-opt3.pdb.gz breaks before (TR461)F99 with cost 2.32327

    Perhaps I could do some mix-and-match loops to fix the loop around P155 in try11, or the loop around F99 in try14.

    I could also try clash and break reduction starting from gromacs-optimized models.

Sat Aug 2 08:28:55 PDT 2008 Kevin Karplus
    try15 will try to polish the gromacs-optimized versions of try5 and try10.
        try16: try11 and try9
        try17: try12 and try3
        try18: try13 and try7
        try19: try14 and try1

Sat Aug 2 10:52:20 PDT 2008 Kevin Karplus
    I made a how-different undertaker script and Makefile target, to see how far the models are from the TR461 initial model, which is supposed to be GDT 88% and RMSD_CA 1.22 Angstroms.

    Some of the models (particularly the try6/9/11/16 series) seem to be a bit too far from the initial model. The try7-try13-try18 series may also be a bit too far.

    Perhaps I should make chimeras of try15 with try18 and try16, to pick up just one of the loops from them.

    try16 has bad breaks for P155 and F156, so that loop shouldn't be used. try18 has a bad break for S22 and a minor one for E100.

    The loops to copy are approximately
        A145-I159 from try18
        W87-H101 from try16

Sat Aug 2 12:24:24 PDT 2008 Kevin Karplus
    try21-opt3 (optimizing chimera-try15-try18) scores well with the try20/try21 costfcn.
    try20-opt3 scores almost as well.

    The residual clashes are slightly larger than in try15, but the dry packing numbers are better. The same can be said of try20, though it doesn't score quite as well as try21.

    If I go back to the try15...try19 costfcn, try21 comes out just after try15 and try10, with try20 not far behind.

    Perhaps I should make chimeras of try15 with each of the other loops we've been investigating:
        try19 has breaks before F99, P155, and V141, so don't use early loop and be cautious with later loop.
            copy I144-P155
        try17 has breaks before E97 and G136, so don't use early loop
            copy I144-P155

    I might also want to make a chimera of whatever early loop comes out best and whatever late loop comes out best.

    I should also polish each of the models I'm considering with a high-clash-weight costfcn, to remove the small residual clashes.
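
The chimera idea in the entries above (take a base model such as try15 and copy one loop, for example I144-P155, from another model) can be sketched as a splice of PDB ATOM records. This is only an illustration under stated assumptions: both models are single-chain, share the TR461 residue numbering, and are already superposed in the same frame. The function names are hypothetical, and this is not how undertaker builds its chimeras. Atom serial numbers are not renumbered, and the junctions may still need break and clash optimization afterwards.

```python
def residue_number(pdb_line):
    """Residue sequence number from columns 23-26 of a PDB ATOM record."""
    return int(pdb_line[22:26])


def make_chimera(base_lines, donor_lines, first, last):
    """Return the ATOM records of the base model with residues
    first..last (inclusive) replaced by the donor model's records.

    Assumes single-chain models with identical residue numbering that
    are already in the same coordinate frame. Non-ATOM records are
    dropped to keep the sketch short.
    """
    donor_loop = [
        line for line in donor_lines
        if line.startswith("ATOM") and first <= residue_number(line) <= last
    ]
    chimera = []
    loop_inserted = False
    for line in base_lines:
        if not line.startswith("ATOM"):
            continue
        if first <= residue_number(line) <= last:
            # Replace the whole base loop with the donor loop, once.
            if not loop_inserted:
                chimera.extend(donor_loop)
                loop_inserted = True
            continue
        chimera.append(line)
    return chimera
```

For the loop named above, a call would look like make_chimera(try15_lines, try19_lines, 144, 155), where the two arguments are lists of lines read from the respective PDB files.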

Sat Aug 2 12:46:27 PDT 2008 Kevin Karplus
    try22 started to optimize chimera-try15-try19
    try23 started to optimize chimera-try15-try17

    I should make chimeras try20-try21, try20-try19, and try20-try17 to explore the other decent early loop paired with the other late loops. In each case I'll copy I144 to P155.

Sat Aug 2 13:04:20 PDT 2008 Kevin Karplus
    try24 started to optimize chimera-try20-try21
    try25 started to optimize chimera-try20-try19
    try26 started to optimize chimera-try20-try17

    All are using the same costfcn as try20..23.

Sat Aug 2 15:08:15 PDT 2008 Kevin Karplus
    With the try20..try26 costfcn, the top scorers are
                origin of
                Loop1   Loop2
        try22   try15