Tue May 6 09:25:09 PDT 2008 T0389 Make started
Tue May 6 09:25:41 PDT 2008 Running on cheep.cse.ucsc.edu

Tue May 6 09:38:12 PDT 2008 Kevin Karplus
We seem to have gotten yet another hand-prediction target that has high homology. So far all the targets (server-only or human/server) have been high-homology targets. At least this one is only 33% identity over 113 residues (or 38% over 78 or 26% over 142), so there is some modeling to do.

Tue May 6 10:42:10 PDT 2008 Kevin Karplus
The top hits from the HMMs are not exactly the same as those from the BLAST run, but there is pretty good consensus on superfamily c.46.1 (though not on the family).

Tue May 6 13:03:22 PDT 2008 Kevin Karplus
This superfamily is "Rhodanese/Cell cycle control phosphatase", whose members are either sulfurtransferases or phosphatases. We may be able to tell which we have by looking for active site residues.
The first hit, 2oucA, is a MAP kinase binding domain of MKP5.
The next two hits, 1t3kA and 1qb0A, are cell cycle phosphatases.
Then comes 1gmxA, a single-domain sulfurtransferase.
Then comes 2j6pA, an Arsenate-Antimonate Reductase.
Then 1yt8A, a multidomain sulfurtransferase.
The multiple alignments are in pretty good agreement about which residues are conserved, and the closest hits are to the phosphatases, rather than the sulfurtransferases.

Tue May 6 15:03:35 PDT 2008 Kevin Karplus
There seems to be agreement between the HMMs and BLAST that 2oucA is the best template, though different templates come up as next best.

Tue May 6 17:42:29 PDT 2008 Kevin Karplus
There are pairs of CYS residues near each other in try1-opt3:
    C142-C77
    C98-C46
    C127-C132
None of these are conserved, and there is no reason to believe that there are disulfides, but this is an interesting coincidence.
In try2-opt3, the pairing is different:
    C98-C46
    C142-C132
There is a possibility of CYS-HIS clustering, but not of disulfides, so I'll turn off maybe_ssbond.
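
The "pairs of CYS residues near each other" check above can be sketched roughly as follows. This is only an illustration, not how undertaker or maybe_ssbond works: the function name close_cys_pairs, the 7.0 Angstrom cutoff, and the toy SG coordinates are all assumptions made up for the example (the residue numbers are borrowed from the notes, but the coordinates do not come from any try model).

```python
from itertools import combinations
from math import dist


def close_cys_pairs(sg_coords, cutoff=7.0):
    """List cysteine pairs whose SG atoms lie within `cutoff` Angstroms.

    sg_coords maps residue number -> (x, y, z) of the SG atom.
    The cutoff is an assumed, deliberately loose value; a real disulfide
    has an SG-SG distance of roughly 2 Angstroms.
    Returns (res_i, res_j, distance) tuples sorted by distance.
    """
    pairs = []
    for (ri, xyz_i), (rj, xyz_j) in combinations(sorted(sg_coords.items()), 2):
        d = dist(xyz_i, xyz_j)
        if d <= cutoff:
            pairs.append((ri, rj, d))
    return sorted(pairs, key=lambda p: p[2])


# Toy coordinates (hypothetical, NOT taken from any T0389 model).
toy_sg = {
    46: (0.0, 0.0, 0.0),
    98: (3.0, 4.0, 0.0),     # 5.0 A from residue 46
    127: (30.0, 0.0, 0.0),
    132: (30.0, 6.0, 0.0),   # 6.0 A from residue 127
    77: (60.0, 60.0, 60.0),  # far from everything
}

for res_i, res_j, d in close_cys_pairs(toy_sg):
    print(f"C{res_i}-C{res_j}  {d:.1f} A")
```

A pair list like this only says the sulfurs are close; whether a pair is a plausible disulfide still depends on conservation and geometry, which is why the notes discount the pairs found here.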

Tue May 6 21:28:17 PDT 2008 Kevin Karplus
The try3 costfcn scores try2-opt2.gromacs0 better than any of the try3 models. Maybe I should try polishing what I have, then wait for the server tarball to be made and see what I can pick out of that.

Wed May 7 08:22:06 PDT 2008 Kevin Karplus
try4-opt3 still has very bad breaks. I'll do a similar polishing run, but with the break cost turned up. Also, try4-opt3 has made C46-C98 into an ssbond, which may account somewhat for the breaks before K48. I'll turn off maybe_metal, so that there is no bonus for clustering cys.

Wed May 7 10:06:59 PDT 2008 Kevin Karplus
try5-opt3 improves slightly on try4-opt3, but the clashes are just as bad and the breaks are only slightly reduced. The protocol for try5, which started with reduced clash and break costs, meant that try2-opt3.gromacs0 never got really involved in the optimization. Perhaps I can raise soft_clashes and breaks until it scores as well as try5-opt3, then do the optimization without the scaling.

Thu May 8 13:35:38 PDT 2008 Kevin Karplus
try6-opt3 has considerably reduced both breaks and clashes, but rosetta prefers try3-opt3.gromacs0.repack-nonPC and try5-opt3.gromacs0.repack-nonPC. For try7, I should drop pred_nb11_back a bit and increase soft_clashes and breaks.

Thu May 8 17:44:00 PDT 2008 Kevin Karplus
try7-opt3 does not look bad. The crystallographers report that this is a dimer. It looks like the dimerization interface is around F99, L103, V102, M50, with K106 rather in the way. I should probably look for dimeric templates.

Sat May 10 08:35:11 PDT 2008 Kevin Karplus
I downloaded the server tarball, and the MQA_init costfcn likes try5-opt3 best of all the models. Of the non-SAM models, BioSerf_TS1, nFOLD3_S4, and MUProt_TS1 come out on top. Perhaps I should do an optimization from just the server tarball.

Mon May 12 14:12:13 PDT 2008 Kevin Karplus
MQA_init was, of course, not the right costfcn to use for evaluating polished models from the server.
The MQAC (consensus) cost function prefers the Zhang-Server models (probably because all the meta servers like them). The MQAU costfcn prefers nFOLD3_TS4 and SAM-T08-server_TS1. The MQA measures are predicting GDT of 60-80%. I have started optimization runs from the 10 top models of both MQAU and MQAC.

Mon May 12 17:17:55 PDT 2008 Kevin Karplus
The top 5 models with the try8 costfcn are
    T0389.try8-opt3.pdb.gz
    T0389.try7-opt3.pdb.gz
    T0389.MQAU1-opt3.pdb
    T0389.try8-opt3.repack-nonP...
    T0389.MQAC1-opt3.pdb

Mon May 12 17:25:41 PDT 2008 Kevin Karplus
The different models agree moderately well on the core (T4-F124), but not at all on the C-terminus. Of course, the last 6-8 residues are just a HIS tag, but that still leaves about 20 residues without predicted structure. We predict helix out to about C127, but after that we have only weak predictions of coil conformation.

Thu May 15 10:24:28 PDT 2008: SAM-T08-MQAO hand QA T0389 Submitted
Thu May 15 10:24:28 PDT 2008: SAM-T08-MQAU hand QA T0389 Submitted
Thu May 15 10:24:28 PDT 2008: SAM-T08-MQAC hand QA T0389 Submitted

Sat May 24 12:42:06 PDT 2008 Kevin Karplus
I tried rescoring try8 with the clash definitions changed to
    ReadClashTable exp2-pdb-cullpdb_pc80_res1.2_R0.2_d070810_chains408-2symm.clash
    SetClashDefinition exp2-pdb-cullpdb_pc80_res1.2_R0.2_d070810_chains408-2symm
which raises the clash costs and makes the gromacs models look a little better (since they have rather aggressive clash removal).
Note: the try8 model is based on BAKER-ROBETTA_TS2, so only models try7 and earlier are free of meta-server influence.

Sat May 24 16:51:52 PDT 2008 Kevin Karplus
try9-opt3 is based on try8-opt3, which in turn came from BAKER-ROBETTA_TS2. It has few clashes and no breaks. The clash detector is slightly different from gromacs's, since the gromacs reoptimization is reported as having more clashes than try9-opt3. Rosetta's favorite so far is try9-opt3.gromacs0.repack-nonPC.
try8 and try9 differ from BAKER-ROBETTA_TS2 after S136, a region that is likely to be disordered. try8-opt3 and try9-opt3 are almost identical, and try9-opt3.gromacs0.repack-nonPC is only slightly different. I'll go with the one rosetta likes best.
One thing I like from the initial alignment, which seems to be missing in the try9-opt3 model, is
    SheetConstraint (T0389)T35 (T0389)L39 (T0389)K135 (T0389)L131 hbond (T0389)H37
I'll try adding this in try10, though I expect that moving the strand in such a tightly packed model will be difficult.

Sat May 24 20:17:47 PDT 2008 Kevin Karplus
I submitted
    1 T0389.try10-opt3.pdb
      a metaserver prediction based on BAKER-ROBETTA_TS2
    2 T0389.try7-opt3.pdb
      a native undertaker/SAM prediction
    3 T0389.MQAC1-opt3.pdb
      a metaserver prediction based on top consensus models
    4 T0389.MQAU1-opt3.pdb
      a metaserver prediction based on top models using alignment constraints and undertaker cost functions
    5 T0389-2oucA-t06-local-str2+near-backbone-11-0.8+0.6+0.8-adpstyle5.a2m
      our favorite alignment method applied to the lowest-evalue template

Tue Jun 10 14:35:09 PDT 2008 Kevin Karplus
The TR389.pdb model was given to us as one of the best server models, to use as the basis for refinement. TR389 does not score particularly well with the try10 costfcn, so I may need to come up with a different costfcn (perhaps extracting sheets and helices from TR389).
I wrote a short undertaker script, find-TR389.under, to try to find out what server the model comes from. Actually, it probably doesn't come from a server, but from a human predictor (probably not us).
Note: TR389 ends with residue K135, so the last 18 residues (which I was having trouble modeling) are not part of the structure to refine!
The best real_cost is for LEE-SERVER_TS1-scwrl, with a GDT of 94.4%.
Our best model was T0389.try10-opt1-scwrl.pdb with a GDT of 85.7% to TR389 (27th/710 on real_cost).
Our next best was try10-opt3.repack-nonPC, also with a GDT of 85.7%.
    SAM-T06-server_TS1 got only 73.3%
    SAM-T02-server_AL1.pdb-scwrl got 73.1%
    SAM-T08-server_TS3 got 69.8%
    SAM-T08-server_TS1-scwrl got 53.3%
Wow, it looks like our servers really botched this model, but I fixed it up by hand.
The MQAC MQA function liked the LEE-SERVER_TS1 model (rank 17). MQAU ranked it 53. MQAO ranked it 82.
Aside from the loop and helix for L64-S78, the TR389 model agrees pretty closely with our top model.
To do refinement, I'll have to start a new directory, since the sequence is shorter.
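
For readers unfamiliar with the GDT percentages quoted above, here is a minimal sketch of a GDT_TS-style score. It is not the undertaker real_cost computation and not the official GDT program: it assumes the model and reference C-alpha coordinates are already superimposed, whereas a real GDT calculation searches over many superpositions to maximize the count at each threshold. The function name gdt_ts and the toy coordinates are hypothetical. Scoring only the residues present in the reference mirrors the note that TR389 stops at K135, so residues past that point do not count.

```python
from math import dist


def gdt_ts(model_ca, ref_ca, thresholds=(1.0, 2.0, 4.0, 8.0)):
    """GDT_TS-style score (in percent) for pre-superimposed CA coordinates.

    model_ca and ref_ca map residue number -> (x, y, z).
    Only residues present in the reference are scored; reference residues
    missing from the model count as misses at every threshold.
    ASSUMPTION: the two structures are already superimposed.
    """
    if not ref_ca:
        raise ValueError("reference has no residues")
    deviations = [dist(model_ca[r], ref_ca[r]) for r in ref_ca if r in model_ca]
    fractions = [
        sum(d <= t for d in deviations) / len(ref_ca) for t in thresholds
    ]
    return 100.0 * sum(fractions) / len(thresholds)


# Toy coordinates (hypothetical, not from TR389 or any submitted model).
ref = {1: (0.0, 0.0, 0.0), 2: (3.8, 0.0, 0.0), 3: (7.6, 0.0, 0.0), 4: (11.4, 0.0, 0.0)}
model = {
    1: (0.0, 0.5, 0.0),    # 0.5 A off
    2: (3.8, 1.5, 0.0),    # 1.5 A off
    3: (7.6, 3.0, 0.0),    # 3.0 A off
    4: (11.4, 9.0, 0.0),   # 9.0 A off
    5: (15.2, 0.0, 0.0),   # not in the reference, so ignored
}

print(f"GDT_TS-style score: {gdt_ts(model, ref):.2f}%")
```

Because a fixed superposition can only undercount residues within each threshold, a score from this sketch is a lower bound on what a full GDT search would report for the same pair of structures.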