Wed Jul 5 10:51:04 PDT 2006 T0359 Make started
Wed Jul 5 10:53:31 PDT 2006 Running on vashon.cse.ucsc.edu

Wed Jul 5 11:28:22 PDT 2006 Kevin Karplus
BLAST gets excellent full-length hits in PDB
    1d5gA 45% over 92 residues 7e-17
    1vj6A 46% over 91 residues 1e-16
    1gm1A 46% over 91 residues 1e-16
    1wf8A 49% over 90 residues 2e-15
    ...

Wed Jul 5 19:07:27 PDT 2006 Kevin Karplus
HMMs also have excellent hits: 2fe5A, 2fneA, 1g9oA, ... all b.36.1.1.

Wed Jul 5 19:59:02 PDT 2006 Kevin Karplus
try1 failed, because of bug in undertaker.
I'll fix it and run again.

Wed Jul 5 22:23:24 PDT 2006 Kevin Karplus
All alignments very similar. Just need to close gaps and polish.
try2 started on lopez to do initial gap closing (still with constraints).

Wed Jul 5 22:49:30 PDT 2006 Kevin Karplus
try2 restarted on lopez (I had forgotten to make read-pdb.under first).

Wed Jul 19 13:11:27 PDT 2006 George Shackelford
try2 still has a break, but so does T0359.undertaker-align.pdb model 3.
If I can get it out of the mix maybe we can get a try with no breaks.
Kevin says that won't work because there is an insertion/deletion
difference. Hmm. Let's look at the alignments.
Yes, there is basically an insertion at E32. There are also one or two
short sheets that seem to be complicating the break. I think I'll just
first crank up the break and clash costs and see what I can get.
try3 running on vashon

Wed Jul 19 17:08:04 PDT 2006 George Shackelford
So far as I can see try3 is not fixing the break. As I look at the
insertion I can see where undertaker is trying to make a short sheet
that is probably based on one of the templates. It doesn't work here
because of the insert and that is causing the break.
I am running try4 by including the lines of read-pdb.under and
commenting out all except those with gromacs. Only the gromacs models
have any solution to fixing the break.
try4 running on vashon

Sat Jul 22 08:45:18 PDT 2006 Kevin Karplus
Actually, try3-opt1 seems to be doing the best.
I'm starting a polishing run with a cost function that has no
constraints, that turns the beta Hbond weights down, and turns the
break and soft_clashes weights up. (try5 started on cheep).

Sat Jul 22 08:48:59 PDT 2006 Kevin Karplus
I'm a bit worried about the tiny hairpin that is not in most of the
alignments:
    SheetConstraint (T0359)K12 (T0359)N13 (T0359)L17 (T0359)G16 hbond (T0359)N13 1
The n_sep and o_sep alphabets do not predict it. If anything, they have
    Hbond L10.O I19.N
    Hbond G18.O L10.N
    Hbond G18.O T11.N
    Hbond T20.O T11.N
    Hbond I19.O K12.N
in that region.
I'll start a prediction from the alignments with the sheet constraints
and helix constraints of try3, but *without* this hairpin.

Sat Jul 22 09:03:41 PDT 2006 Kevin Karplus
try6 started on camano.
try5-opt2 scores best with try5=try6 costfcn, and rosetta likes
try5-opt2.gromacs0.repack-nonPC, but rosetta likes
try6-opt2.gromacs0.repack-nonPC even better.

Sat Jul 22 20:14:01 PDT 2006 Kevin Karplus
I'll submit
    ReadConformPDB T0359.try5-opt2.pdb
    ReadConformPDB T0359.try6-opt2.gromacs0.repack-nonPC.pdb
    ReadConformPDB T0359.try3-opt1.pdb
    ReadConformPDB T0359.try1-opt2.pdb
    InFilePrefix
    ReadConformPDB T0359.undertaker-align.pdb model 1 # from t2k 2fe5A

    try1 < alignments (2bygA)
    try3 < try2 < try1
    try5 < try4-opt2 < try2-opt2.gromacs0
    try6 < alignments (2bygA)

Sat Jul 22 20:37:14 PDT 2006 Kevin Karplus
Submitted with comment

We did very little work on this target, just running optimizers to try
to close the small residual gaps.

Model 1 is try5-opt2, optimized by undertaker from all earlier tries,
but mainly from try4-opt2 (in turn from try2-opt2.gromacs0, from
try1-opt2, from alignment to 2bygA). try5-opt2 scores best with a cost
function that emphasizes closing gaps and avoiding clashes.

Model 2 is try6-opt2.gromacs0.repack-nonPC. It was optimized by
undertaker from alignments using the same cost function as for try5.
The undertaker model was then reoptimized by gromacs (which is good at
removing tiny clashes) and had its sidechains repacked by rosetta. It
is rosetta's favorite of the backbones for which it repacked sidechains.

Model 3 is try3-opt1, the best-scoring with our default unconstrained
costfcn. This costfcn may have a bit too much reward for
beta-sheet-forming H-bonds. Try3 was optimized from try2-opt2, from
try1-opt2.

Model 4 is try1-opt2, the fully automatic model. It was optimized by
undertaker from alignments (using mainly 2bygA).

Model 5 is sidechain replacement by SCWRL on an alignment to 2fe5A, the
top-scoring template with our HMMs.

------------------------------------------------------------

Tue Jul 25 15:00:00 PDT 2006 George Shackelford
I am going to use gromacs to try and close the gap. The current
application of gromacs is basic energy minimization (em) in vacuo. I am
going to use a set of md (molecular dynamics) runs that are a
simulation of the protein in water at 300K and 1 atmosphere pressure
for 100ps (picoseconds). The expectation is to close the gap under
solvent conditions.

The process first does an em in vacuo for 400 steps. Next we create a
box around the protein's dimensions with 9 Angstroms of extra space
outward. This box is filled with H2O. The atoms of the protein are
position restrained during a 10ps period to allow the water to "soak"
into position w.r.t. the protein. The result is then put through an md
that simulates 100ps in 50000 steps. An rms-based average pdb file is
created from the position of atoms in the time interval 70-100ps. This
average structure is usually crude; we use this average pdb (which
still contains the water molecules) and put it through another 400
steps of em. The final positions of the protein atoms are then
substituted in the original target pdb to create a final gromacs1.pdb.

This procedure is essentially the same as presented in the GROMACS
tutorial.
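
[Editorial sketch, not George's script.] The run lengths quoted above
imply a few derived numbers (time step, step counts, box margin in
GROMACS units). The Python below is only a minimal sketch of that
arithmetic, plus a plain arithmetic mean over frames as a stand-in for
the averaging step. All function names are hypothetical, and the
averaging assumes the frames are already superimposed; the log does not
say how the rms-based average was actually computed.

```python
# Sketch only: bookkeeping for the run lengths quoted in the log entry.
# All function names are hypothetical; none come from shakebake.sh.


def timestep_ps(total_ps, n_steps):
    """Time step (in ps) implied by simulating total_ps in n_steps."""
    return total_ps / n_steps


def steps_for(duration_ps, dt_ps):
    """Number of integration steps needed to cover duration_ps."""
    return round(duration_ps / dt_ps)


def angstrom_to_nm(angstrom):
    """GROMACS lengths are in nanometres; 1 nm = 10 Angstroms."""
    return angstrom / 10.0


def average_positions(frames):
    """Arithmetic mean position of each atom over a list of frames.

    Each frame is a list of (x, y, z) tuples in the same atom order.
    Assumption: frames are already superimposed; no fitting is done.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    return [
        tuple(sum(frame[i][axis] for frame in frames) / n_frames
              for axis in range(3))
        for i in range(n_atoms)
    ]


dt = timestep_ps(100, 50000)      # 100 ps simulated in 50000 steps
pr_steps = steps_for(10, dt)      # 10 ps position-restrained "soak"
first_step = steps_for(70, dt)    # start of the 70-100 ps window
last_step = steps_for(100, dt)    # end of the 70-100 ps window
margin_nm = angstrom_to_nm(9)     # 9 Angstrom box margin

print(dt, pr_steps, first_step, last_step, margin_nm)
```

With the numbers in the entry this works out to a 2 fs time step, 5000
steps for the soak, an averaging window covering steps 35000 to 50000,
and a 0.9 nm box margin.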
The mdp files and a shell script for running this procedure are
available in the gromacs1 directory in the casp7 directory. The script
takes two arguments: the first is the target ID, the second is the try
identifier, e.g.,
    shakebake.sh T0359 try3-opt2.gromacs0

I expect this procedure has two limited uses: closing stubborn gaps and
generating "refinement" submissions. Even for small proteins this is
CPU intensive and takes several hours; it can take 10 hours for
proteins with ~250 residues.

I should note that this was developed around GROMACS 3.3.1. The current
version on the server is GROMACS 3.2.1. I have compiled a version local
to my directory and I have coded the script to only use the programs in
my local directory. Now is not the time to update GROMACS on the server.

Tue Jul 25 18:48:51 PDT 2006 George Shackelford
ARRGH! I forgot that gromacs warps the structure so it scores poorly.
More gromacs means it scores very poorly. Perhaps it would help if it
were repacked, but I'm not going to spend more time on this. We will go
with what we've already sent in. Try6 does have the break fixed and
that will have to do.

Tue Jul 25 19:33:42 PDT 2006 Kevin Karplus
George does not seem to have left the script in this directory, and he
did not modify the Makefile, so I have no idea what commands exactly he
ran, nor where he left the result.

There is a T0359.try5-opt2.gromacs1.pdb.gz file in the decoys
directory, which with the standard Makefile would mean a quick energy
minimization with forcefield 1. I'm *guessing* that this is the model
George is referring to as the result of the md run (I have no idea what
forcefield George used).

It does indeed score terribly---md simulation is much worse than energy
minimization, resulting in large clashes and large breaks, as well as
terrible sidechains.
We could try doing a reoptimization from this terrible md run,
regarding it as just noise injected into the model to make it score
badly (that's pretty much my opinion of molecular dynamics as a method
anyway).

After repacking the sidechains, rosetta likes
try5-opt2.gromacs1.repack-nonPC best.

Tue Jul 25 19:53:39 PDT 2006 Kevin Karplus
Because rosetta likes it (despite what undertaker sees as bad clashes
and breaks), I'll try polishing try5-opt2.gromacs1.repack-nonPC as try7
on cheep.

Tue Jul 25 20:23:52 PDT 2006 Kevin Karplus
I just realized that try6-opt2 had never been polished---it came
directly from alignments with constraints. I'll try polishing it as
try8 (starting from just try6-opt2.gromacs0.repack-nonPC) on lopez.

try7 just finished. Rosetta does not like it as well as the
md-optimized backbone, but almost as well. Undertaker likes it a *lot*
better than the md "optimized" backbone. I'll do a polishing run (try9)
with the same costfcn as try8, starting from the repacked models of
try7-opt2.

Tue Jul 25 20:32:12 PDT 2006 Kevin Karplus
try9 started on cheep.
When try8 and try9 finish, I'll pick new best models to submit and do
the final submission.

Tue Jul 25 20:59:45 PDT 2006 Kevin Karplus
try9 finished. try9-opt2 scores almost as well as try5-opt2 with the
try8=try9 costfcn, and try9-opt2.gromacs0.repack-nonPC scores almost as
well as try7-opt2.gromacs0.repack-nonPC with rosetta. It looks like it
may be a good compromise.

I *do* wonder why undertaker hates
decoys/T0359.try5-opt2.gromacs1.repack-nonPC.pdb.gz, which rosetta
loves, and which of the two programs has the error. In the non-repacked
model, the problem is mainly self-bumps and bond-length errors, and the
problem persists in the backbone even after repacking.

try8-opt2 scores better with the try8=try9 costfcn than try9-opt2
(better than any but try5-opt2), though it has bigger breaks and
clashes than try5-opt2.
try8-opt2.gromacs0.repack-nonPC scores slightly better with rosetta
than try6-opt2.gromacs0.repack-nonPC, which it was polished from, so it
should replace try6.

Tue Jul 25 21:27:28 PDT 2006 Kevin Karplus
Submitted with
    ReadConformPDB T0359.try5-opt2.pdb
    ReadConformPDB T0359.try8-opt2.gromacs0.repack-nonPC.pdb
    ReadConformPDB T0359.try9-opt2.gromacs0.repack-nonPC.pdb
    ReadConformPDB T0359.try3-opt1.pdb
    InFilePrefix
    ReadConformPDB T0359.undertaker-align.pdb model 1 # from t2k 2fe5A

There are some differences in several of the loops (though try3-opt1 is
a bit too close to try5-opt2).