Wed Jun 7 09:28:24 PDT 2006 T0321 Make started
Wed Jun 7 09:29:48 PDT 2006 Running on orcas.cse.ucsc.edu

Wed Jun 7 16:47:27 PDT 2006 Kevin Karplus
No good hits---ab initio or difficult fold recognition.
Conservation signals are strong in all three alignments, but not quite identical. T06 and t04 look the same in sequence profile, but t2k has a few different conserved residues.
Top alignments are similar, but all a bit fragmentary.
This is probably an alpha/beta fold with the alpha helices on both sides of a beta sheet. I'm hoping that the sheet constraints from the alignments will help in assembling the fragmentary alignments.

Wed Jun 7 21:05:28 PDT 2006 Kevin Karplus
We have a pretty good model after about V105 (or maybe K123). Before V105 is pretty much junk though. Also, the two C-terminal strands probably belong with the rest of the sheet.
We should do subdomains. I've started M1-K123 and V105-K251 on the farm cluster.
The model that we have seems to come mainly from 1vpdA, the top-scoring model with the HMMs (though with E-value 0.05).

Mon Jun 25 15:29 PDT 2006 Zack Sanborn
I'm taking over this protein. A soft deadline is this Wednesday.
Kevin's first thought was to break this protein up into subdomains around the following amino acids: Lys103, Gly122.
And it looks like he already did this with the subdomains: domain 1 = M1-K123 and domain 2 = V105-K251. So, that's pretty cool. I'll take a look at these subdomains and see if we can piece them together somehow.

Mon Jun 25 17:00 PDT 2006 Zack Sanborn
I've started two unconstrained optimization runs on the two subdomains (both called try2). I've also started an optimization run on a chimera of the two subdomains, which I made by superimposing the two subdomain PDBs on the try1-opt2 model and then copying and pasting the two subdomains together (not sure if this would still be called a "chimera" in this case).
Started try2 for M1-K123 on orcas, and try2 for V105-K251 on camano.
Unfortunately, the superposition didn't align the chains very well (it left a sizable gap). I started another optimization run to hopefully get these domains to stick together. Not sure if this is the right way of doing this, but I didn't see any other way.
Started try2 for the chimera on camano.

Mon Jun 25 18:04 PDT 2006 Zack Sanborn
Whoops, I'm using the sheet and helix constraints from the alignment, not from the models. I would stop the job (try2 at /T0321), but I've started two jobs on camano, and both appear identical (because both are try2 jobs, one on a subdomain and one on the full chimera).
So, I've made a try3 that is what try2 should have been and am running it on lopez. I've added the constraints from both subdomains (both helix and sheet constraints). So, if try2 totally screws up, hopefully this will produce a better model.

Mon Jun 25 21:53 PDT 2006 Zack Sanborn
The try2 runs for the subdomains are done and have improved the structure. Also, the chimera I made was optimized using the original cost function (try2) and in a run with the helix and sheet constraints from their respective subdomain models (try3). try2-opt2 is the highest scoring model thus far.
I've started two more runs. try4 starts with a new chimera, made the same way as described above for try2, but using the newly optimized subdomain models. try5 takes the helix and sheet constraints from these models and essentially starts over to see if it comes up with a different or better structure.
After these tries finish, I'd like to do a polishing run starting from all models.

Tue Jun 27 07:59:58 PDT 2006 Kevin Karplus
try4 is the best-scoring with the try4=try5 and unconstrained costfcns, and is just behind try2 on the try1 costfcn.
I made a preliminary submission:
Model 1 is try4-opt2, optimized from a chimera of try2 runs on the two subdomains (not sure where the crossover point was taken).
Model 2 is try2-opt2, an optimization from a chimera of try1 runs on the two subdomains.
Model 3 is try1-opt2, the fully automatic run.
Model 4 is just sidechain replacement by SCWRL on an alignment to 1vpdA.
Model 5 is just sidechain replacement by SCWRL on an alignment to 1vmeA.
The README file really needs more notes on chimeras---exactly where was the crossover point? (At least this README told me which models were used for making the chimeras.)

Tue Jun 27 08:19:30 PDT 2006 Kevin Karplus
I think that I like the hairpin at the end of the model, but it is getting lost in the most recent runs. Perhaps we could restore it?

Tue Jun 27 11:20 PDT 2006 Zack Sanborn
Sorry Kevin, the crossover point was at Lys123 for both chimeras. Next time, I'll be more explicit.
I'll take a look at trying to get the hairpin you want back in the models tomorrow.

Wed Jun 28 12:36:53 PDT 2006 Zack Sanborn
I'm attempting to get that hairpin back. It was in the try2-opt2 model, but in neither the chimeras nor the subdomain structures.
First, I've taken all of the helix and sheet constraints from try2-opt2 and have restarted a structure from the alignments. Since try2-opt2 was started from the chimera, I'm hoping that restarting from the alignments but using the "good" constraints will come up with a better structure. This run is called try6 and was started on vashon.
Since try2-opt2 is also one of the best scoring models, I'm doing a polishing run that starts from all models. However, we also want to keep structures that have the hairpin, so the cost function has the sheet constraint for the hairpin but is otherwise unconstrained. I'm hoping this will get the best features of all the models while keeping the hairpin. This run is called try7 and was started on camano.

Mon Jul 3 13:56:13 PDT 2006 Zack Sanborn
try7 was successful: it became the highest scoring model and kept the beta hairpin we wanted. A polishing run, try8, using an unconstrained cost function was started. This improved the score of the model a bit.
Right now, I'm not sure what is left to do on this structure. After getting an email from Kevin about doing polishing runs from GROMACS optimized structures (which are screwed up enough to possibly get models out of a local minimum), I made a try9 that will do just this. However, I'm not sure if it'll help or just completely screw up the structure. We'll see, I guess.
try9 was started (using an unconstrained costfcn) on orcas. It is starting from the try7-opt2 and try8-opt2 GROMACS optimized structures only.

Mon Jul 3 14:39:50 PDT 2006 Zack Sanborn
I was looking at how we got to try8-opt2 and found (through the try*.log's):
    try8-opt2 > try7-opt2 > try2-opt2 > try2-chimera

Mon Jul 3 14:59:57 PDT 2006 Zack Sanborn
Apparently, doing the optimization run starting from GROMACS optimized structures was the right thing to do... however, I needed to increase the weights for soft_clashes (from 20 to 50) and breaks (from 50 to 200). So, I've started a try10 that increases these costs, but starts from the try7-opt2 and try8-opt2 GROMACS structures like try9.
I decided to keep try9 running because it may be interesting to see the effect that the different costs have on the structure.

Wed Jul 5 13:58:06 PDT 2006 Zack Sanborn
The new optimization runs (try9 and try10) have finished. They are the top scoring models, mostly because their breaks have been significantly reduced. The structures look good but are a little "foamy". So, I'm starting a new run, using all structures (but it will likely pick try9-opt2 or try10-opt2), that increases the weights for phobic_fit and dry5. Hopefully this will help pack the protein a little better.
But starting from the GROMACS structures when you have a lot of breaks in the Undertaker structures does work in producing a better structure.

Wed Jul 5 18:06:05 PDT 2006 Zack Sanborn
It does appear that this run did help with the structure's packing. Currently, using the try11 costfcn, try11-opt2 is the best scoring model.
try11 is based off of try9-opt2, which is no surprise considering try9-opt2 was the best scoring model up to that point.
I just started a new optimization run that is unconstrained. This is a polishing run that will also allow the structure to expand if the penalties I put on it for try11 were too constrictive. We'll see. I started try12 on orcas.

Thu Jul 6 16:17:42 PDT 2006 Zack Sanborn
Well, try12 chose try8-opt2 as the model to optimize, not try11. This means that Undertaker prefers the try8 model over the try11 model that was packed a little better. Looking at score-all.try12.pretty, we see that try11-opt2 is the fourth best scoring model behind try12-opt2, try8-opt2, and try12-opt1. They differ by little in overall score. The big differences are in phobic_fit (which try11-opt2 does better in) and in side_chain (which try12 and try8 do significantly better in).
Not terribly sure what to do next, but I'm going to update the best-models.pdb file to see how similar the top scoring models are. I put the following models in superimpose-best.under:
    ReadConformPDB T0321.try12-opt2.pdb   (model 1)
    ReadConformPDB T0321.try8-opt2.pdb    (  "   2)
    ReadConformPDB T0321.try11-opt2.pdb   (  "   3)
    ReadConformPDB T0321.try9-opt2.pdb    (  "   4)
    ReadConformPDB T0321.try4-opt2.pdb    (  "   5)
As expected, all models strongly agree on the first domain of the protein. For the other domain, models 1, 2, and 4 were very similar to one another. This makes sense since try12 and try9 were both based on try8.
The try11 structure is significantly different in the second domain, which is a good thing since we don't want to submit the same structure 5 times.
The try4 model is also different from the other models but shares many of the characteristics of the try12, try9, and try8 models. However, it has a long helix where there is a nice beta-hairpin in the other models. The try4 model is the lowest scoring model of the group, but not by much.
I think I will try a polishing run from only try11-opt2 to see if that structure can get any better. try13, a polishing run for try11-opt2, was started on orcas.

Fri Jul 7 15:30:07 PDT 2006 Zack Sanborn
Well, try13 is currently the best scoring model with an unconstrained costfcn. The structure has an overall bend to it, inherited from the try11-opt2 model. I didn't notice it earlier, but I believe this bend was caused by some strong weights put on try11 in phobic_fit and dry5 to try to pack the structure better. But I'm glad to see that, with an unconstrained costfcn, try13-opt2 is the best scoring model.
Now, I'm really not sure what to do. I'll update the best-models.pdb for Kevin. He might have an idea what to do, if anything, before we submit tomorrow (deadline is Sunday, July 9th).
Actually, I checked the all.breaks.gz file and there are three sizeable breaks that we could try getting rid of:
    T0321.try13-opt2.pdb.gz breaks before (T0321)P207 with cost 2.71589
    T0321.try13-opt2.pdb.gz breaks before (T0321)I141 with cost 1.32544
    T0321.try13-opt2.pdb.gz breaks before (T0321)E167 with cost 1.27245
All other breaks are pretty minimal (i.e., cost less than 1). I started a new run, try14, which will try to minimize these breaks. I've upped the penalty for breaks, gaps, etc.
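
A minimal sketch of how break lines like the ones quoted above could be filtered by cost. It assumes the uncompressed report contains lines in exactly the form shown in this log; the function names and the default threshold of 1.0 are illustrative choices, not part of undertaker.

```python
import re

# Matches break-report lines in the form quoted above, e.g.
#   T0321.try13-opt2.pdb.gz breaks before (T0321)P207 with cost 2.71589
BREAK_RE = re.compile(
    r"(?P<model>\S+) breaks before \((?P<target>\w+)\)"
    r"(?P<aa>[A-Z])(?P<resnum>\d+) with cost (?P<cost>\d+(?:\.\d+)?)"
)


def parse_breaks(text):
    """Return (model, residue, cost) for every break line found in text."""
    return [
        (m.group("model"), m.group("aa") + m.group("resnum"), float(m.group("cost")))
        for m in BREAK_RE.finditer(text)
    ]


def sizeable_breaks(text, threshold=1.0):
    """Breaks with cost >= threshold, largest first (threshold is an assumption)."""
    hits = [b for b in parse_breaks(text) if b[2] >= threshold]
    return sorted(hits, key=lambda b: b[2], reverse=True)


if __name__ == "__main__":
    report = """\
T0321.try13-opt2.pdb.gz breaks before (T0321)P207 with cost 2.71589
T0321.try13-opt2.pdb.gz breaks before (T0321)I141 with cost 1.32544
T0321.try13-opt2.pdb.gz breaks before (T0321)E167 with cost 1.27245
"""
    for model, residue, cost in sizeable_breaks(report):
        print(model, residue, cost)
```

Reading the real file would need something like gzip.open("all.breaks.gz", "rt").read() first; that path is taken from the entry above and its exact location is not stated in this log.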

Sat Jul 8 10:13:17 PDT 2006 Kevin Karplus
try14-opt2 still has some bad breaks:
    Conformation[31] T0321.try14-opt2.pdb.gz has 43 breaks
    T0321.try14-opt2.pdb.gz breaks before (T0321)E167 with cost 1.20615
    T0321.try14-opt2.pdb.gz breaks before (T0321)I141 with cost 0.914793
    T0321.try14-opt2.pdb.gz breaks before (T0321)C175 with cost 0.720551
gromacs closed some of the littler gaps, but opened up new bigger ones to avoid clashes:
    Conformation[30] T0321.try14-opt2.gromacs0.repack-nonPC.pdb.gz has 19 breaks
    T0321.try14-opt2.gromacs0.repack-nonPC.pdb.gz breaks before (T0321)C142 with cost 1.8626
    T0321.try14-opt2.gromacs0.repack-nonPC.pdb.gz breaks before (T0321)A176 with cost 1.24646
    T0321.try14-opt2.gromacs0.repack-nonPC.pdb.gz breaks before (T0321)E167 with cost 1.18682
    T0321.try14-opt2.gromacs0.repack-nonPC.pdb.gz breaks before (T0321)P207 with cost 1.06357
    T0321.try14-opt2.gromacs0.repack-nonPC.pdb.gz breaks before (T0321)I141 with cost 1.04853
    T0321.try14-opt2.gromacs0.repack-nonPC.pdb.gz breaks before (T0321)F220 with cost 1.03101
    T0321.try14-opt2.gromacs0.repack-nonPC.pdb.gz breaks before (T0321)Q90 with cost 0.712483
I'm still not real pleased with domain 1---we might have done better to generate more models for that domain. Our structure doesn't agree with secondary structure prediction (using either the whole-chain alignments or the domain 1 alignments). The rr predictions for domain 1 are adequately matched, but not great. The conserved residues from M1-K123/ are not clustered.
try14-opt2 does look like the best we've got, but I think that try11-opt2 may be too close to it to be an alternative model. Similarly, try12 and try9 may be too close to each other.
(Rosetta likes best decoys/T0321.try14-opt2.gromacs0.repack-nonPC.pdb.gz )
I'll do one more polishing run, starting from the gromacs0.repack... models. I'll turn up the packing terms to try to make things a little tighter.
I'll also up clashes and breaks, but remove deep_knot (which is too slow for use in the optimization).
Try14-opt2, try12-opt2, and try4-opt2 look like good choices to submit. I wonder what other 2 models I should submit? Polished single-domains? Polished server models? (The Robetta models score best.)

Sat Jul 8 10:37:48 PDT 2006 Kevin Karplus
try15-opt2 started on cheep, trying to polish from the gromacs0.repack-nonPC models.

Sat Jul 8 10:55:09 PDT 2006 Kevin Karplus
V105-K251/ try3 started on lopez to polish single-domain model.

Sat Jul 8 11:03:39 PDT 2006 Kevin Karplus
M1-K123/ try3 started on lopez to polish single-domain model.

Sat Jul 8 11:16:24 PDT 2006 Kevin Karplus
The ROBETTA_TS5 model matches secondary structure well for 1-103, but is a bit foamy and unconvincing. Still, we might do well to make a chimera of it and one of our own models---perhaps optimizing it just as a single domain to save some time.

Sat Jul 8 11:34:29 PDT 2006 Kevin Karplus
I made M1-K123/decoys/chimera-robetta5-try14.pdb.gz from
    M1-N110 from decoys/servers/ROBETTA_TS5.pdb
    D111-K123 from decoys/T0321.try14-opt2.pdb
I'll try optimizing it with the M1-K123 try1 costfcn (it matches the constraints better than anything else we have).

Sat Jul 8 12:26:15 PDT 2006 Kevin Karplus
try15-opt2 has smaller clashes and breaks than try14-opt2, but undertaker doesn't like it quite as well with the try15 costfcn---packing terms and hbonds are worse.
Rosetta likes best T0321.try15-opt2.gromacs0.repack-nonPC.pdb.gz

Sat Jul 8 12:36:26 PDT 2006 Kevin Karplus
M1-K123/ try3 and try4 have finished. V105-K251/try3 has also finished. I'll put them all in superimpose-best.under, look at them, and decide which (if any) to submit.
For M1-K123, try1, try2, and try3 are quite similar. I'd favor the smaller breaks and clashes of try3-opt2. try4-opt2 scores very well with M1-K123/try3.costfcn, despite not having been optimized for it.
I'll do a polishing run on M1-K123/try4 (starting from the gromacs models), using the try3 costfcn.

Sat Jul 8 12:49:33 PDT 2006 Kevin Karplus
M1-K123/ try5 started on cheep.

Sat Jul 8 12:51:25 PDT 2006 Kevin Karplus
The V105-K251/ tries 1-3 are all more or less the same. They all have the C-terminal helix rather than beta hairpin, so are in the style of try4-opt2, not the more recent runs. I don't see much point to submitting the V105-K251 models separately.

Sat Jul 8 13:03:29 PDT 2006 Kevin Karplus
try14/try15 and try12 differ mainly in the placement of the first domain. try12 and try4 differ mainly in the C-terminal hairpin/helix.

Sat Jul 8 13:08:23 PDT 2006 Kevin Karplus
I'll have a hard time making a chimera of try15-opt2 and M1-K123/try4 (or try5), because there isn't much in common between them, so getting the domains oriented reasonably will be tough.

Sat Jul 8 13:20:36 PDT 2006 Kevin Karplus
I take that back---we can superimpose fairly well on V91-V97 and crossover between P89 and Q90.

Sat Jul 8 13:24:23 PDT 2006 Kevin Karplus
For M1-K123 undertaker now likes best try5-opt2, but rosetta still prefers decoys/T0321.try3-opt2.gromacs0.repack-nonPC.pdb.gz

Sat Jul 8 13:29:02 PDT 2006 Kevin Karplus
I made a chimera decoys/chimera-try15-domain1-try5.models.pdb.gz from M1-P89 of M1-K123/try5-opt2 (optimized from ROBETTA_TS5). It has some bad breaks and clashes (as would be expected from a chimera), so I'll try optimizing it.

Sat Jul 8 13:33:14 PDT 2006 Kevin Karplus
try16 started on cheep, to optimize the try15/robetta5 chimera.

Sat Jul 8 15:15:29 PDT 2006 Zack Sanborn
I've looked at the following server models to try to find other possibilities for the first domain:
    ROBETTA_TS5.pdb (best scoring, Kevin used it)
    ROBETTA_TS4.pdb
    ROBETTA_TS3.pdb
    ROBETTA_TS2.pdb
    FUGMOD_TS5.pdb
    PROTINFO_TS5.pdb
Only the ROBETTA models appear to have two domains in their models. It would be impossible to make any chimeras from the unidomain models FUGMOD_TS5 and PROTINFO_TS5.
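
For reference, the splice-at-a-crossover step behind the chimeras in this log (for example, a crossover between P89 and Q90) can be sketched as below. This is only an illustration, not the procedure actually used: it assumes both models are single-chain PDB files already superimposed in a common frame, and the function names are hypothetical. Atom serial numbers are not renumbered.

```python
import gzip


def read_pdb_lines(path):
    """Read a PDB file (optionally gzip-compressed) as a list of lines."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return handle.read().splitlines()


def residue_number(line):
    """Residue sequence number from columns 23-26 of a PDB ATOM record."""
    return int(line[22:26])


def splice_chimera(n_term_lines, c_term_lines, last_n_term_res):
    """Join two models at a crossover point.

    Residues numbered <= last_n_term_res come from the first model and
    all later residues from the second. Only ATOM records are kept.
    """
    def atoms(lines):
        return [ln for ln in lines if ln.startswith("ATOM")]

    head = [ln for ln in atoms(n_term_lines) if residue_number(ln) <= last_n_term_res]
    tail = [ln for ln in atoms(c_term_lines) if residue_number(ln) > last_n_term_res]
    return head + tail


if __name__ == "__main__":
    # Tiny synthetic models with one CA atom per residue (made-up coordinates).
    fmt = "ATOM  %5d  CA  ALA A%4d    %8.3f%8.3f%8.3f"
    model_a = [fmt % (i, i, 1.0 * i, 0.0, 0.0) for i in range(88, 92)]
    model_b = [fmt % (i, i, 1.0 * i, 5.0, 0.0) for i in range(88, 92)]
    for line in splice_chimera(model_a, model_b, last_n_term_res=89):
        print(line)
```

A splice like this leaves whatever gap or clash exists at the crossover, which is why the chimeras above were followed by an optimization run.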
The other ROBETTA models (TS2 -- TS4) have two domains, but I see nothing "better" about the first domains in any of these three models compared to the best scoring server model ROBETTA_TS5. Actually, most of the first domains in these models appear loosely packed and disordered, especially in model ROBETTA_TS3. So, I think Kevin got it right the first time by choosing ROBETTA_TS5.
As I see it, the following models will be good ones to submit:
    try16-opt2 (depending on how it turns out)
    try15-opt2
    try12-opt2
    try4-opt2
    try11-opt2 (iffy, similar to try14/try15) or something based off an alignment?

Sat Jul 8 15:34:29 PDT 2006 Zack Sanborn
try16 has completed and try16-opt2 is the second best scoring model using the try16.costfcn. It is beaten by try14-opt2. try16-opt2 appears to do better with soft_clashes and breaks than try14-opt2, but doesn't do as well with dry5 and dry6.5 (packing) and hbond_geom_beta* terms.
try15-opt2 scores below the try14 models. So, maybe we should submit try14-opt2 and try15-opt2?

Sat Jul 8 15:44:50 PDT 2006 Kevin Karplus
try14 and try15 are very similar to each other, so there is no need to submit both. Choosing between them depends on which parts of the cost function you want to weight highest. I liked try15-opt2 a bit better, though I'm not sure I can articulate why. (Rosetta also liked it better, because of decreased clashes.)
try16-opt2 scores almost as well as try14-opt2, despite less polishing.
I don't like the disulfide bridge in try16-opt2---it seems unlikely given the unpaired cysteines in the rest of the protein. The cost function should include maybe_metal, but not maybe_ssbond.

Sat Jul 8 16:16:50 PDT 2006 Kevin Karplus
I'll try optimizing try16 with maybe_metal turned on, but not maybe_ssbond. I also added in some constraints (from try16-opt2.sheets and try16-opt2.helices) to keep the character the same, but to encourage improving the sheets and helices.
try17 started on cheep.
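
A minimal sketch of one way to check whether a model still contains a disulfide like the one objected to above: list cysteine SG-SG pairs that are within bonding distance. This is not how undertaker's maybe_ssbond term works; the 2.5 Angstrom cutoff (a real S-S bond is about 2.05 Angstrom) and the function names are assumptions, and the code assumes a single chain with unique residue numbers.

```python
import math
from itertools import combinations


def cys_sg_atoms(pdb_lines):
    """Map residue number -> (x, y, z) of the SG atom of each CYS residue."""
    coords = {}
    for line in pdb_lines:
        if (line.startswith("ATOM")
                and line[17:20] == "CYS"
                and line[12:16].strip() == "SG"):
            coords[int(line[22:26])] = (
                float(line[30:38]), float(line[38:46]), float(line[46:54])
            )
    return coords


def close_sg_pairs(pdb_lines, cutoff=2.5):
    """Return (res_i, res_j, distance) for SG pairs no farther apart than cutoff."""
    sg = cys_sg_atoms(pdb_lines)
    pairs = []
    for i, j in combinations(sorted(sg), 2):
        distance = math.dist(sg[i], sg[j])
        if distance <= cutoff:
            pairs.append((i, j, distance))
    return pairs


if __name__ == "__main__":
    # Synthetic SG atoms with made-up residue numbers and coordinates.
    fmt = "ATOM  %5d  SG  CYS A%4d    %8.3f%8.3f%8.3f"
    lines = [
        fmt % (1, 10, 0.0, 0.0, 0.0),
        fmt % (2, 20, 2.0, 0.0, 0.0),
        fmt % (3, 30, 10.0, 0.0, 0.0),
    ]
    for res_i, res_j, distance in close_sg_pairs(lines):
        print(res_i, res_j, round(distance, 2))
```

Running it on the try16 and try17 models would show directly whether the later run removed the bridge.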

Sat Jul 8 16:56:44 PDT 2006 Kevin Karplus
try17 is making minor improvements, but is not getting rid of the disulfide.

Sat Jul 8 17:27:34 PDT 2006 Kevin Karplus
rosetta doesn't even think that try17 improved over try16, as clashes went up a little.

Sat Jul 8 17:35:29 PDT 2006 Kevin Karplus
I'll submit
    try14-opt2
    try15-opt2
    try17-opt2
    try12-opt2
    try4-opt2
unless someone suggests a replacement for try15, which is too similar to try14.

Sat Jul 8 17:47:34 PDT 2006 Kevin Karplus
So submitted.
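
A minimal sketch of one way to put a number on "too similar" when picking models to submit: a superposition-free distance RMSD (dRMSD) over CA atoms. This measure was not used in the log, which relied on visual superposition and cost-function scores; the function names are hypothetical and any similarity threshold would have to be chosen by the user.

```python
import math
from itertools import combinations


def ca_coords(pdb_lines):
    """Map residue number -> (x, y, z) of its CA atom, from PDB ATOM records."""
    coords = {}
    for line in pdb_lines:
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            coords[int(line[22:26])] = (
                float(line[30:38]), float(line[38:46]), float(line[46:54])
            )
    return coords


def drmsd(a, b):
    """Distance RMSD between two residue -> coordinate maps.

    For every pair of residues present in both models, compare the
    intra-model CA-CA distance; no superposition is needed.
    """
    shared = sorted(set(a) & set(b))
    diffs = [
        (math.dist(a[i], a[j]) - math.dist(b[i], b[j])) ** 2
        for i, j in combinations(shared, 2)
    ]
    if not diffs:
        raise ValueError("need at least two shared residues")
    return math.sqrt(sum(diffs) / len(diffs))


if __name__ == "__main__":
    # Three made-up CA positions; the second model is a rigid translation.
    model_a = {1: (0.0, 0.0, 0.0), 2: (1.0, 0.0, 0.0), 3: (2.0, 0.0, 0.0)}
    model_b = {k: (x + 5.0, y, z) for k, (x, y, z) in model_a.items()}
    print(drmsd(model_a, model_b))
```

Because dRMSD ignores rigid-body placement of the whole chain, it would still pick up differences in relative domain placement like those noted between try14/try15 and try12.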