Thu May 18 09:57:00 PDT 2006 T0293

Make started Thu May 18 09:59:20 PDT 2006 Running on shaw

Thu May 18 14:52:13 PDT 2006 Kevin Karplus
The t2k alignment finds 3 PDB files: 1nv8A 1sg9A 1vq1A (all hemK)
The t04 alignment finds 10 PDB files.
The t06 alignment finds 44 PDB files.
There are at least 100 good templates in the superfamily.
It looks like the family is c.66.1.30
The top templates look like 1t43A 2b3tA 1sg9A 1vq1A 1nv8A

Make started Thu May 18 16:01:18 PDT 2006 Running on orcas.cse.ucsc.edu
I killed the jobs on shaw and restarted them on orcas, because shaw seemed to be thrashing in muscle. (I also removed the muscle alignments from the pairwise alignment targets.)

Thu May 18 20:00:10 PDT 2006 Kevin Karplus
This looks like fold recognition with easy-to-find templates, as BLAST just misses getting significant hits:
    T0293 1vq1A 30.00  80 50 3 67 146 139 212 0.371 30.42
    T0293 1sg9A 30.00  80 50 3 67 146 127 200 0.371 30.42
    T0293 1nv8A 30.00  80 50 3 67 146 129 202 0.371 30.42
    T0293 1s6yA 22.33 103 73 2 48 143 223 325 7.0   26.18

Make started Fri May 19 09:03:20 PDT 2006 Running on lopez.cse.ucsc.edu
The undertaker run seems to have died (or been killed) without an error message in the try1.log file, so I'm running the make again.

Fri May 19 15:43:00 PDT 2006 Kevin Karplus
The try1 model seems to be coming from 1ej0A, though there may be assembly of pieces from several templates: 1nv8A, 1sqgA, 2as0A, 1h1dA, 1ws6A, 1ej0A
The e-value seems to keep getting worse as later alignments are chosen. This looks like a case where the good alignments were damaged by undertaker.
The top templates for T0293.best-scores.rdb overlap at least partially with the top blast hits.
    1nv8A 284 5.3000e-31 8.0162e-14       c.66.1.30           86224
    1dusA 195 6.8200e-21 1.4625e-13 1dusA c.66.1.4            34182
    1jg1A 216 4.0500e-19 1.3858e-10       c.66.1.7            66661
    1o54A 266 7.3000e-21 1.5390e-10       c.66.1.13           92482
    1dl5A 318 5.2500e-20 2.4539e-10 1dl5A c.66.1.7,d.197.1.1  64720,64721
    1f3lA 322 5.4900e-20 3.0416e-10 1g6q2 c.66.1.6            59630
    1im8A 226 2.1200e-20 1.2693e-09       c.66.1.14           66212
    1i9gA 265 7.3300e-19 2.2275e-09       c.66.1.13           62090
Perhaps we should limit the set of alignments to choose from---perhaps only the top 10 or 20 hits with the HMMs?

Thu May 25 15:08:11 PDT 2006 Kevin Karplus
Need to look at templates and try to figure out what to do on this one.
T0293 Do a run with just the top few templates and blast hits.

Sun Jun 4 12:19:44 PDT 2006 Kevin Karplus
There seems to be a large N-terminal region that is not matched by the templates. We might want to do two overlapping subdomains V1-E49 and D28-D250. I started both of these running on lopez.

Sun Jun 4 18:23:43 PDT 2006 Kevin Karplus
OOPS. I forgot to check which split-into-domains was being called---this Makefile still had the old one, so the try1 runs were being done with the old methods from pcpe/starter-directory. I'll fix the Makefiles and rerun make (after moving try1 into a new directory).
The V1-E49 run had completed, so I looked at it. It is predicting an alpha-beta domain instead of the all-alpha stuff we saw before. I'm not sure I believe it though.

Sun Jun 4 21:41:02 PDT 2006 Kevin Karplus
I reran V1-E49, and the new prediction (also an alpha-beta domain) looks a bit more convincing---at least it ends with a helix, so may be able to link up with the second domain.

Mon Jun 5 16:25:14 PDT 2006 Kevin Karplus
The D28-D250 part was rerun also. George just fixed the handling of residue-residue predictions for chains that don't start with residue 1, so I remade the rr predictions for D28-D250. This did not change the results of constraints, but did eliminate the error messages.
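
Note on subdomain numbering: the D28-D250 subdomain does not start at residue 1, so anything computed in subdomain-local numbering has to be shifted before it can be compared with the full chain. The following is only a minimal sketch of that bookkeeping, not George's fix and not SAM or undertaker code. The function name to_full_chain, the (i, j, probability) tuple layout, and the sample values are all hypothetical.

```python
def to_full_chain(pairs, first_residue):
    """Shift 1-based subdomain-local residue pairs to full-chain numbering.

    pairs: iterable of (i, j, probability) in local numbering (assumed layout).
    first_residue: full-chain number of the subdomain's first residue.
    """
    offset = first_residue - 1
    return [(i + offset, j + offset, p) for i, j, p in pairs]


# Hypothetical predictions for the D28-D250 subdomain, in local numbering.
local_pairs = [(1, 12, 0.41), (38, 58, 0.77)]
full_pairs = to_full_chain(local_pairs, first_residue=28)
print(full_pairs)  # [(28, 39, 0.41), (65, 85, 0.77)]
```

With first_residue=1 the function returns the pairs unchanged, which is the case the older code presumably assumed.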

Mon Jun 5 17:08:51 PDT 2006 Kevin Karplus
I tried making a chimera of try1-opt2 from V1-E49 and D28-D250, crossing over at P39. This chimera1 has terrible clashes, but may be worth trying to optimize. I'll try this as try2 on camano.

Tue Jun 6 12:16:24 PDT 2006 Kevin Karplus
try2-opt2 has done a decent job of fixing up the chimera. Clashes and breaks are definitely reduced relative to try1-opt2. Constraints are not well met, probably because they are inconsistent. try2-opt2 also scores best with the unconstrained costfcn. For try3, I should probably polish either without constraints, or with constraints taken from try2-opt2.
(Hmm, trying to score all the server models crashed undertaker again. This time by CaspIta-FOX_TS1)

Tue Jun 6 13:20:10 PDT 2006 Kevin Karplus
After commenting out the CaspIta-FOX models, I could score all the servers with the unconstrained costfcn. try2-opt2 still scores best, and SAM_T06_server is the top server model, followed by ROBETTA_TS5, Pmodeller6_TS1, ROBETTA_TS4, ...

Tue Jun 6 13:41:22 PDT 2006 Kevin Karplus
Looking at the superposition of many models, including ours and some of the server models, I'm not so pleased with try2-opt2. It has moved too far from the models from alignments. (It is a little hard to tell, since the superposition is poor---it is getting trapped by the differences in the N-terminal residues.)
Wait, that's not the problem---I was starting with a superposition on P39. I've changed to using a pretty much unchanged portion (G65-G85) to initialize the superposition.

Tue Jun 6 13:54:52 PDT 2006 Kevin Karplus
Now the models superimpose well, but the two problem areas are the N-terminal region (for which the SAM_T06_server model looks most promising) and N143-G181. The loops produced by the servers in this region look pretty trashy.

Wed Jun 7 17:27:05 PDT 2006 Kevin Karplus
Looking at the superposition of several models, I see some problems with some of the models around residue E185.
try1-opt2, try2-opt2, D28-D250/try1-opt2, and the third alignment from undertaker-align.pdb (to 1jg1A) seem to be misaligned. I should probably remove any sheet constraints on this region that come from 1jg1A and reoptimize.
Incidentally, C140, C142, and C206 all cluster, so may form a metal-binding site, though there is no corresponding site in the templates and none of these is highly conserved (C140 is somewhat conserved in t2k, but IVTL are all more common).

Wed Jun 7 18:32:11 PDT 2006 Kevin Karplus
I've created a try3.costfcn and will do a try3 run from the alignments (including the subdomain alignments) on shaw. If it does a decent job, I'll submit try3-opt2 and try2-opt2 tomorrow morning, do some polishing and have the final submission ready by Monday, when it is due.

Wed Jun 7 22:08:25 PDT 2006 Kevin Karplus
try3-opt2 scores worse than try1-opt2 and try2-opt2. The problem is that the hairpin at T230-W246 has gotten detached from the sheet. It seems to be attached properly in SAM_T06_server_TS1, so I'll pick up sheet constraints from there for try4.

Thu Jun 8 09:20:47 PDT 2006 Kevin Karplus
try4-opt2 is better, but still needs some work. In particular, I suspect that the predicted N-terminal strands may be part of the big sheet, perhaps adjacent to the C-terminal strands. Still, there is a soft deadline this morning, so I'll make a submission.

Thu Jun 8 17:37:34 PDT 2006 Kevin Karplus
Firas used ProteinShop to fix up the N-terminus where I thought it ought to go (decoys/fromTry4.proto.pdb). He and I then made up sheet constraints (partly borrowed from servers/SAM_T06_server_TS1) to fix up the bulgy edge strand and connect to the newly added piece. We are trying to polish it up as try5 on the farm cluster. If it doesn't clean up the bulgy sheet, we may want to do a cut-and-paste from servers/SAM_T06_server_TS1 to make a chimera to optimize.

Fri Jun 9 08:48:05 PDT 2006 Kevin Karplus
try5-opt2 did not clean up the bulgy strand.
Looking at ROBETTA_TS5 and the second undertaker-align model, I see that try5-opt2 may have the strand upside down. I'll try cutting and pasting in a strand from the second undertaker-align model.
I made a chimera of try5-opt2 and the second model in undertaker-align, copying in V226-C235. The breaks are bad, but they should be fixable, especially if I get the sheet constraints right.
Oops, that chimera has a bulgy strand, which will be difficult to line up with. Let's try making a chimera with the SAM_T06_server_TS1 model, copying V229-Q236.
Now we need to get the sheet constraints right.
From SAM_T06_server_TS1.sheets:
    SheetConstraint (T0293)W203 (T0293)G209 (T0293)S247 (T0293)R241 hbond (T0293)Y204 1
    SheetConstraint (T0293)V229 (T0293)C235 (T0293)W246 (T0293)M240 hbond (T0293)T230 10
Added by hand:
    SheetConstraint T230 F234 E32 P36 hbond Y231 30

Fri Jun 9 11:32:53 PDT 2006 Kevin Karplus
try6-opt1 gets a new best score on the try6 costfcn, so it seems that we can get a decent model out of the chimera. There are still some problems with very bad breaks before R238, G237, and V226, which do not seem to be closing in the opt2 part of the run.
Other little problems: K59 seems to be in the way of V1 and F5 packing in closely. L34 may need to be closer to L51. The buried residues around G69, I75, V93, F147, F186 seem a bit too exposed---as if the protein had flexed open a bit. Also the inserted loop from K154 to G181 is probably all junk.

Fri Jun 9 12:43:43 PDT 2006 Kevin Karplus
Sure enough, try6-opt2 did no further reduction of the breaks. Even more annoyingly, it did not get the sheet constraints for V229-Q236 fully satisfied. I'll try another run with breaks turned up and the unsatisfied constraints strengthened.
Interestingly, the try6-opt2.repack-nonPC model has significantly poorer satisfaction of the constraints, despite an identical backbone.
I think that undertaker may have picked some rather "off" rotamers that put CB in funny places to try to satisfy the constraints. I'll try the next optimization from the rosetta-repacked model, to avoid the bad CB positioning.

Fri Jun 9 17:10:54 PDT 2006 Kevin Karplus
try7-opt2 has closed the worst breaks---the following are the worst ones left:
    T0293.try7-opt2.pdb.gz breaks before (T0293)E21 with cost 1.43015
    T0293.try7-opt2.pdb.gz breaks before (T0293)G209 with cost 1.07871
    T0293.try7-opt2.pdb.gz breaks before (T0293)T232 with cost 0.798814
    T0293.try7-opt2.pdb.gz breaks before (T0293)R241 with cost 0.583598
There are still some problems getting the beta sheet to be well formed, because the strand containing T232 is offset a bit from where it should be relative to L244. Sliding the sheet containing the N-terminus and the strand with T232 over about 1 residue so the H-bonds line up right would probably fix the problem. Perhaps Firas could do that with ProteinShop? Moving the first helix to reduce the exposed hydrophobics might also be easier with ProteinShop.
In the meantime, I'll try really upping the Hbond constraints for the hbonds I want, to see if undertaker will do it if forced. (Running as try8 on cheep.)

Fri Jun 9 20:45:56 PDT 2006 Kevin Karplus
try8 closed gaps a little more, but did not improve the constraints much.

Sun Jun 11 15:09:10 PDT 2006 Firas Khatib
I tried moving the T232 sheet over and saved it 4 times along the way to do minimal changes (and therefore minimal damage):
    shift232over1FromTry8.renum.pdb = try9 (running on orcas)
    shift232over1.1FromTry8.renum.pdb = try10 (running on orcas)
    shift232over1.2FromTry8.renum.pdb = try11 (running on whidbey)
    shift232over1.2andMoveH33-37.renum.pdb = try12 (running on whidbey)
It would be nice if we could get Proteinshop to draw hydrogen bonds since with Proteinshop you can see the hydrogen bond sites and hydrogen cages! For now, I moved it in baby steps and will run 4 tries to see if any of them help.
I will move the first helix afterwards. I did not want to move the sheet containing the N-terminus if I don't have to because it has 6 hydrogen bonds already and I don't want to break those!

Sun Jun 11 17:41:10 PDT 2006 Firas Khatib
I used Proteinshop while the 4 tries were running to modify the T232 sheet into a full sheet (since you can change residue ss assignments in Proteinshop) and I was able to make 6 hydrogen bonds between the T232 strand and the L244 strand! I lost a few hbonds with the 33-37 strand but will try to fix that next.
This new Proteinshop attempt is: shift232over2FromTry8.renum.pdb
I am running this as try13 on shaw.

Sun Jun 11 20:20:06 PDT 2006 Kevin Karplus
All the new runs seem to be making good progress, but it is harder to say which one will end up the best. Currently, try8-opt2 still scores best, but try11-opt1 and try10-opt1 are doing well enough that they may be able to compete. I'll probably want to do a polishing run from all models (at least all the new models) when the current runs have finished.

Sun Jun 11 21:55:13 PDT 2006 Kevin Karplus
Unfortunately, try13 has "try9" inside it everywhere, so whatever it did was either stepped on by try9 or is now called try9. Sigh, and that was supposed to be the most promising one. I will do the global replace in try13.under and run it again, perhaps with slightly shorter optimizations, so that there will be time for a polishing run.
Actually, I don't think it will fit the costfcn very well, since that is requesting that T230 be aligned with A245, but Firas has slid it the other way to align with S247. I suppose that we could try that alignment also, though I don't like how it sticks V229 out.
I made a try14-costfcn that has the alignment Firas was trying to make, and it scores try9-opt2 best (which may actually have come from the mislabeled try13 run).
I will do another optimization run with try14 (though I don't really believe in it), and submit try8 and the best model that comes out with Firas's alignment.

Sun Jun 11 22:20:58 PDT 2006 Firas Khatib
wow... it seems I have totally botched the last hopes for this target. There wasn't enough slack on the 225-228 end of the strand, which is why I shifted the strand the other way, since there was a lot more slack from 234-241, but I see what you mean about the V229.

Sun Jun 11 23:33:54 PDT 2006 Firas Khatib
I tried to line up T230 and A245 as well as rotate the N-terminal helix to reduce the exposed hydrophobics.
    decoys/lineup230and245take1.2rotateNterminusAndMove.renum.pdb
I also tried to slide the 33-37 helix back (since I moved the 230-245 one) but this was more difficult:
    decoys/lineup230and245take1.9rotateNtermSlide33-37helix.renum.pdb
I will run this one as try15, in case it is better, and I will run lineup230and245take1.2rotateNterminusAndMove.renum.pdb as try16, which I hope will be the best!

Mon Jun 12 00:46:28 PDT 2006 Kevin Karplus
try14-opt2 is based on try9-opt2, but does not really satisfy the sheet constraints for either the try14 or the try9-13,try15-16 cost functions. I should probably try the optimization with this alignment over again with just the
    ReadConformPDB shift232over2FromTry8.renum.pdb
starting point, since optimizing from all was not very successful in making a clean sheet. (running this as try17 on cheep)
So in the morning, I'll have to choose between whatever scores best with try16.costfcn and whatever scores best with try17.costfcn, perhaps doing a final polishing pass on that. I should probably submit as model 2 whatever scores best with the other costfcn, as I'm not that certain of the correct alignment.

Mon Jun 12 06:08:13 PDT 2006 Kevin Karplus
I don't like the way try15-opt2 has swung out the initial helix, so discard that one.
try17-opt2 does a better job of covering predicted buried residues than try14-opt2, so I prefer it, even though try14-opt2 has slightly better H-bonds. Actually, the difference is *which* hbonds are present for strand T230-G237, since neither model manages to get the H-bonds on both sides.
I will submit
    ReadConformPDB T0293.try17-opt2.pdb
    ReadConformPDB T0293.try16-opt2.pdb
    ReadConformPDB T0293.try8-opt2.pdb
    ReadConformPDB T0293.try4-opt2.pdb
    ReadConformPDB T0293.undertaker-align.pdb model 1 (from 1nv8A)

Mon Jun 12 06:43:34 PDT 2006 Kevin Karplus
submitted.

Wed Jun 14 10:09:41 PDT 2006 Kevin Karplus
Solution released as 2h00A.

Wed Jun 14 14:34:42 PDT 2006 Kevin Karplus
Foo! we did *not* do well on this one, AND the server did better than we did by hand. Our best model (try15-opt2) is not one we submitted, because I didn't like it. The model from SAM_T06_server_TS1 was better than any of our hand submissions. So much for my understanding of protein structure.
The N-terminus was not at all like what we got from handling it as a subdomain. The big insertion that we never touched (between F151 and G177) was disordered anyway, so it was just as well we didn't fuss with the model there.
The best server model was Zhang-Server_TS1, but we would have done well to copy the best-scoring server with the unconstrained costfcn (other than ours), ROBETTA_TS5, which scored 6th among the servers with the evaluation function I'm using and best with GDT. Using just GDT, SAM-T02_AL5 is the best of our servers, but is still not very good.

Fri Jul 14 11:38:38 PDT 2006 Kevin Karplus
Using the improved evaluation in evaluate.unconstrained.pretty, the SAM_T06 server is 30th of 53 TS1 models from servers---pretty feeble! (real-cost 0.24, while ROBETTA_TS5 is -0.24, and Zhang-Server_TS1 is -0.21) Our best model was try15-opt2 (0.23) and our best submitted model was model4 (0.28). We did worse by hand than the median server! The sheet is more twisted and curled than we made it.
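
Note on GDT: the comparisons above use GDT alongside the real-cost evaluation. As a rough illustration only, the sketch below computes a GDT_TS-style number from per-residue CA distances under one fixed superposition. This is a simplification: the real measure searches over many superpositions to maximize the count at each cutoff, so this version can only underestimate it. It is not the evaluation code used here, and the function name and the sample distances are hypothetical.

```python
def gdt_ts_fixed_superposition(distances):
    """Approximate GDT_TS from CA-CA distances (in Angstroms) between a model
    and the experimental structure, for ONE given superposition.

    Returns the mean, over cutoffs 1, 2, 4, and 8 A, of the percentage of
    residues within the cutoff.
    """
    if not distances:
        raise ValueError("need at least one residue distance")
    cutoffs = (1.0, 2.0, 4.0, 8.0)
    n = len(distances)
    fractions = [sum(d <= c for d in distances) / n for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


# Hypothetical distances for a four-residue toy example.
print(gdt_ts_fixed_superposition([0.5, 1.5, 3.0, 9.0]))  # 56.25
```

Residues missing from the model would normally count against it; a caller could represent them with a distance larger than 8 to get that behavior in this sketch.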