Fri Jun 27 09:11:37 PDT 2008 T0476 Make started
Fri Jun 27 09:12:05 PDT 2008 Running on cheep.cse.ucsc.edu

Fri Jun 27 09:14:20 PDT 2008 Kevin Karplus
    This protein looks like it has 2 distinct matches from PDB blast:
        residues 6-29: 1milA, 1tceA
        residues 36-105: 1dz4a, 2a1nA, ...
    I wonder if undertaker will be able to put the parts together.
    The first part looks too short to be an independent domain.

Fri Jun 27 09:27:59 PDT 2008 Kevin Karplus
    RATS! This is a Pyrococcus ORFan.

Fri Jun 27 11:19:56 PDT 2008 Kevin Karplus
    Top hits with HMMs are for c.31.1.5 (1ma3A, 1yc5A, 1q1aA, 1j8fA)
    with E-values from 3.9 to 7.9.
    Also high on the list are g.37.1.1
    (2drpA, 2ctdA, 1rmdA, 1aliA, 1llmC, 1a1gA, 1a1hA)
    with E-values from 5.1 to 16.
    Since this is an ORFan, none of the local-structure or
    contact-prediction methods are going to be particularly good.
    It looks like try1 will be based (in part) on 2gx9A, which is d.299.1.1.
    I may want to start runs which use only the c.31.* or only the
    g.37.* templates.

Fri Jun 27 14:27:49 PDT 2008 Kevin Karplus
    try1-opt3 doesn't look so hot to me.
    good features:
        the three helices are predicted.
        disulfide bonds are formed for the 4 cys residues: C47-C4 C50-C7
    bad features:
        hydrophobic residues exposed.
        No hbonds for strand.
    May want to form strand-helix-strand for G51-K88
    alignment 5 has a bit of hairpin that may be worth saving:
        SheetConstraint F45 K46 F45 E53 hbond F45

Tue Jul 1 15:58:02 PDT 2008 SAM-T08-MQAO hand QA T0476 Submitted
Tue Jul 1 15:58:02 PDT 2008 SAM-T08-MQAU hand QA T0476 Submitted
Tue Jul 1 15:58:02 PDT 2008 SAM-T08-MQAC hand QA T0476 Submitted

Wed Jul 2 11:08:36 PDT 2008 Kevin Karplus
    I've started the metaserver predictions.
    Question: should the CYS residues form disulfides or should they
    coordinate a metal ion?
    The pairs (C4/C7 and C47/C50) are so close in sequence that
    zinc-binding is a distinct possibility.
    This is a hyperthermophile protein, though, so disulfides can't be
    ruled out.
    The MQAC quality assessment favors Zhang-Server, SAM-T08-server,
    and BAKER-ROBETTA.
    The MQAU assessment favors SAM-T08-server_TS2, Zhang-Server,
    BAKER-ROBETTA, and BioSerf.
    I'm a little worried that the meta-server runs will just pick up
    the SAM-T08-server stuff, and not give me more variety to consider.

Wed Jul 2 14:49:02 PDT 2008 Kevin Karplus
    MQAU1 and MQAC1 are both based on BAKER-ROBETTA_TS5, and both score
    better than try1-opt3 with the try1 costfcn.
    The models are almost identical, but MQAC1 scores slightly better.
    The cys residues are made into disulfides, but stupidly
    (C4-C7, C47-C50).
    The Robetta model has this clustering, but doesn't quite form the
    ssbonds---nor does it arrange the CYS for metal binding.
    I like the robetta model but feel a need to decide what to do with
    the CYS residues.
    Perhaps I should feed the BAKER-ROBETTA_TS5 model to VAST and see
    where it comes from.
        Your VAST Search job was submitted at 07/02/2008 18:00:09(EDT).
        Request ID: 535085030269508703

Wed Jul 2 15:13:33 PDT 2008 Kevin Karplus
    The top hits are 1pprM, 3bf8A, 2du3A, 1d4uA, 3bf7A, 3crvA, 1zyzA,
    2jg0A, 2grcA, 1katA, but the cys are not aligned in any of these.
    I should probably make two lineages of models:
    one with disulfides C7-C47, C4-C50,
    the other with a zinc binding site.

Wed Jul 2 15:19:35 PDT 2008 Kevin Karplus
    try2 started from existing models with the zinc constraints.

Wed Jul 2 20:06:32 PDT 2008 Kevin Karplus
    I had also run try3 with
        SSBond C4 C50
        SSBond C7 C47
    but neither try2 nor try3 looks like a convincing arrangement of
    cys residues, so I'll also do try4 with
        SSBond C4 C47
        SSBond C7 C50
    and see if it does any better.

Thu Jul 3 04:48:47 PDT 2008 Kevin Karplus
    gromacs crashed on try4 (don't know why).

Thu Jul 3 04:55:22 PDT 2008 Kevin Karplus
    I'm going to try again for the metal-binding site, since I'm not
    particularly liking the disulfide attempts.
    I'll also up the weight for breaks.
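    Note on the disulfide options: four cysteines can be paired into
    two disulfides in exactly three ways, and the entries above cover
    all of them (the Robetta-derived C4-C7/C47-C50, try3, and try4).
    A minimal Python sketch that enumerates them follows. It is only
    an illustration, not part of undertaker; the helper name
    `pairings` is made up here, and the printed lines just mimic the
    SSBond lines quoted in this log.

```python
def pairings(items):
    """Yield every way to split an even-length list into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        # Pair `first` with `partner`, then pair up whatever is left.
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


# The four cysteines discussed in this log.
cysteines = ["C4", "C7", "C47", "C50"]
all_pairings = list(pairings(cysteines))

for option in all_pairings:
    # Print each option in the same style as the SSBond lines quoted above.
    print("   ".join(f"SSBond {a} {b}" for a, b in option))
```

    With three disulfide options plus the zinc site, there are four
    hypotheses for the CYS cluster in total.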
Thu Jul 3 11:16:45 PDT 2008 Kevin Karplus
    try5 almost forms a zinc-binding site, but there isn't quite room
    for it, sandwiched in between the conserved W13 and W102 residues.

Thu Jul 3 11:59:21 PDT 2008 Kevin Karplus
    I'll score all the server models with the try5 costfcn, to see if
    anything comes closer to what I want.
    If not, I could also try building models from scratch with the
    try5 costfcn (turning up the sheet constraints).

Thu Jul 3 12:04:47 PDT 2008 Kevin Karplus
    SAM-T08-server_TS4 has the zinc-binding site and helix between
    them, but not much else.
    HHpred5_TS1 has a similar zinc-binding site, but a more continuous
    chain---the rest looks like junk.
    pro-sp3-TASSER_TS2 has the zinc-binding site and looks like it may
    be getting close to having V44-V49 antiparallel to L89-K83.
    The cys are likely to interfere with the sheet, though.

    I put together a try6 costfcn which has the zinc binding site, but
    does not look for the sheets and helices of the BAKER-ROBETTA_TS5
    model.
    The servers that score best with try6 are
    BAKER-ROBETTA_TS5, RBO-Proteus_TS1, RBO-Proteus_TS3,
    fais-server_TS2, BAKER-ROBETTA_TS2, RBO-Proteus_TS4, ...

    I'll try a metaserver run with the try6 costfcn, and also a run
    from alignments.

Thu Jul 3 13:02:56 PDT 2008 Kevin Karplus
    The MQAX6 metaserver run is just polishing up BAKER-ROBETTA_TS5,
    which we've already been working with.
    I could try excluding it from a metaserver run: that will be MQAY6.

    The try6 run has a nice zinc-binding site, but really horrendous,
    unclosable breaks (at least for try6-opt2).
    I'll try again with try7, turning up breaks a fair amount, but not
    clashes.

Thu Jul 3 14:34:44 PDT 2008 Kevin Karplus
    The MQAY6 run started by looking at fais-server_TS5, but quickly
    switched to BAKER-ROBETTA_TS2; the model is not at all compact.

    The try7 run did get a good zinc site, and the breaks are not too
    bad, but the burial is rather crummy---this is not a compact
    arrangement of the helices.
    The MQAX6 model is looking pretty good, and may be improvable with
    a little more break and clash removal.

Thu Jul 3 15:36:58 PDT 2008 Kevin Karplus
    The MQAY6 run has just finished, and MQAY6-opt3 did not come out
    very compact.

Tue Jul 22 16:01:22 PDT 2008 Kevin Karplus
    I'm starting try8 as a polishing run for MQAX6, to try to get rid
    of the bad clashes around the zinc site.

Tue Jul 22 19:45:39 PDT 2008 Kevin Karplus
    Clashes are still bad.
    I'll have to turn up soft_clashes and turn down the zinc
    constraints to get some movement.
    Polishing run with the try9 costfcn started for MQAX6 and
    descendants.

Tue Jul 22 19:54:08 PDT 2008 Kevin Karplus
    I also started try10 with almost the same costfcn (one helix
    constraint is shorter by 1 residue), to optimize try2/try5.

Wed Jul 23 15:05:34 PDT 2008 Josue Samayoa
    I want to change the try9 cost function so as to lower soft
    clashes without losing the zinc site.
    I think the try9 runs look good in general but we may have lowered
    the weight on zinc too much.
    I copied the try9.under and try9.costfcn files and renamed them
    try11.
    I did a global replace of try9 with try11 and changed the zinc
    weight to 90 (from 30) and the soft clash weight to 550 (from 450).

Wed Jul 23 15:54:22 PDT 2008 Josue Samayoa
    Running try11 on lopez

Wed Jul 23 20:38:18 PDT 2008 Firas Khatib
    so try11 is done and doesn't score as well as try2, try5, try9 or
    try10 using its own costfcn.
    It does score the best in terms of the zinc cost, but does much
    worse with the MQAX6.sheets constraintSet.

    Looking at the superposition of the recent runs it seems that
    try11-opt3 is the same as T0476.MQAY6-opt3.pdb: the cysteines have
    the exact same rotamer positions and the models have identical
    zinc costs.
    The backbone traces of both chains are very similar.

    so I think it might be worth doing a run similar to try11, but
    only giving it the try9/try10 models (that were based on
    Robetta_TS5) in case it picks a model based on Robetta_TS2 again.
    I'll run try12 with try10, try9, try8, and MQAX6 as inputs, but
    using Josue's try11 cost function.
    try12 is running on shaw

Thu Jul 24 02:21:08 PDT 2008 Firas Khatib
    well, try12-opt3 scored best using try11's costfcn, but mainly
    because it had fewer breaks and clashes than try10-opt3.
    Sadly, the zinc cost is the same and not as good as try11.

Fri Jul 25 04:53:06 PDT 2008 Kevin Karplus
    There seem to be really only 2 models (based on BAKER-ROBETTA_TS5
    and BAKER-ROBETTA_TS2) with only minor tweaks.
    I made a try13.costfcn that has no helix or sheet constraints.
    It prefers try12-opt3 (from BAKER-ROBETTA_TS5).

    I'll try doing an MQAY13 metaserver run that excludes just the two
    BAKER-ROBETTA models used so far, and see if anything else comes
    up.
    (Probably I'll get a Pcons_dot_net copy of a robetta model, but
    maybe they didn't do that this time.)

Fri Jul 25 11:18:44 PDT 2008 Kevin Karplus
    MQAY13 is based on RBO-Proteus_TS1, and scores quite well with the
    try13 costfcn.
    It has a decent zinc site, but is not as good with the helices as
    our other models.
    I'll put it in as our 5th model, after 2 robetta2 and 2 robetta5
    models.

Fri Jul 25 17:23:46 PDT 2008 Kevin Karplus
    Submitting with comment:

        This Pyrococcus target is an ORFan (only almost identical
        sequences from Pyrococcus and Thermococcus species found).
        That means that the HMMs and neural nets are not nearly as
        effective as usual in predicting remote homologs and local
        structure.

        We ended up with no decent models from SAM+undertaker, so were
        reduced to using only metaserver predictions.

        We expected the four CYS residues to cluster, most likely as a
        zinc-binding site, though with a hyperthermophile protein, a
        pair of disulfides is possible.

        Searching the server models for possible zinc-binding sites
        came up with three potential models, which were further
        optimized to improve the binding site and packing.

        Firas Khatib and Josue Samayoa assisted on predicting the
        structure for this protein.
        Model
        1 T0476.try12-opt3.pdb # < T0476.try8-opt3 < BAKER-ROBETTA_TS5
                # best scoring with try12.costfcn
        2 T0476.try10-opt3.gromacs0.repack-nonPC.pdb # < T0476.try5-opt3 < try2-opt3 < MQAU1-opt3.gromacs0 < BAKER-ROBETTA_TS5
                # best Rosetta energy
        3 T0476.try11-opt3.pdb # < T0476.MQAY6-opt3 < BAKER-ROBETTA_TS2
        4 T0476.MQAY6-opt3.pdb # < BAKER-ROBETTA_TS2
        5 T0476.MQAY13-opt3.pdb # < RBO-Proteus_TS1