Mon Jul 7 09:25:00 PDT 2008 T0487 Make started
Mon Jul 7 09:25:26 PDT 2008 Running on cheep.cse.ucsc.edu

Mon Jul 7 11:27:36 PDT 2008 Kevin Karplus
    T0487 is a long protein (685 residues) and so will almost certainly have to be broken into domains. We are getting good hits to both b.34.14.1 and c.55.3.10 domains, and the two top hits (1yvuA and 1u04A) have both of those domains, so maybe this is a full-length homology model. Since 1u04A is a full-length Argonaute from Pyrococcus furiosus and the target is the Thermus thermophilus Argonaute, it seems likely that this is a complete match. The domains are a PAZ domain and a PIWI domain.

Tue Jul 8 02:34:47 PDT 2008 Kevin Karplus
    try1-opt3 has some rather bad breaks. One group is in the area H621-E629, and probably requires peeling the last strand off the sheet. There are numerous other bad breaks, so I'll reoptimize with breaks (and soft_clashes) turned up by a factor of 4.
    The try1 optimization took 9 hours, so I'll reduce the number of iterations for try2, and try to focus the operations more on break reduction. I'll also add the superfamily alignment reading and remove the all-align alignment.
    There will probably be too much time spent running scwrl, so I might have to turn to the noscwrl scripts for future runs. (I really need to implement saving and restoring alignment and fragment libraries!)

Tue Jul 8 10:17:03 PDT 2008 Kevin Karplus
    try2 didn't run, because I had a typo (1uo4A for 1u04A), so my getting up in the middle of the night to start the run was wasted. I'll start it again on peep.

Tue Jul 8 12:13:56 PDT 2008 Kevin Karplus
    try2 failed with an assertion failure!
        undertaker: ../ultimate/src/Transform/Transform.h:83: bool Transform::OK() const: Assertion `1.-ERR_LIMIT < rot_mag2 && rot_mag2 < 1.+ERR_LIMIT' failed.
    I've never seen that assertion failure before---I wonder what triggered it. I'll probably have to do some debugging, but it ran for about an hour before crashing, so debugging will be difficult.
    I think I'll first try just running again and hoping for a different seed to avoid the bug. I'll save the log of the failed run, so I can restore the seed 1216450465 and (with luck) replicate the error.

Tue Jul 8 13:07:25 PDT 2008 Kevin Karplus
    The SAM-T02-server and SAM-T06-server runs failed because of the length limitations. I restarted the SAM-T02-server run in
        /projects/compbio/tmp/target02-query/target02-query-1215447404-4626
    but I'll probably have to submit the MODEL[1-5].al files manually, since the mail in that old server is not in the Makefile.

Tue Jul 8 18:44:27 PDT 2008 Kevin Karplus
    The SAM-T02-server models were submitted and accepted. The SAM-T06-server run is still going forward---it may take until the deadline!
    The try2 run closed gaps and removed clashes somewhat, but still more is needed (the try2-opt3.gromacs model scores better with the try2 costfcn).

Tue Jul 8 23:37:54 PDT 2008 Kevin Karplus
    There seem to be substantial phase differences between try1+2+3 and the alignment to 1u04A. Perhaps I should break this into domains, work on each domain separately, then put them back together by superimposing on the whole model.
    I might be able to pull out R25-L98, I173-P263, G323-V463, G459-V685.

Fri Jul 11 17:20:41 PDT 2008 Kevin Karplus
    MQAC likes
        Zhang-Server_TS3        0.502
        Zhang-Server_TS4        0.497
        pro-sp3-TASSER_TS3      0.494
        Zhang-Server_TS2        0.482
        GS-KudlatyPred_TS1      0.477
        MUSTER_TS1              0.477
        GS-KudlatyPred_TS2      0.475
        GS-KudlatyPred_TS3      0.474
        circle_TS2              0.470
        circle_TS3              0.470
    MQAU likes
        SAM-T08-server_TS1      0.580
        SAM-T08-server_TS2      0.579
        RAPTOR_TS1              0.577
        GS-KudlatyPred_TS1      0.569
        GS-KudlatyPred_TS3      0.568
        GS-KudlatyPred_TS2      0.567
        MUSTER_TS1              0.567
        BAKER-ROBETTA_TS3       0.565
        pro-sp3-TASSER_TS3      0.565
        BAKER-ROBETTA_TS4       0.565

Mon Jul 14 10:50:44 PDT 2008 Kevin Karplus
    The best scoring models with try1.costfcn are MQAU1, MQAC1, try3-opt3.repack-nonPC. The MQAU1 and the MQAC1 models come from GS-KudlatyPred_TS2.
    They also score well with try3.costfcn and rosetta (at least the gromacs0.repack-nonPC versions do).

Mon Jul 14 11:15:59 PDT 2008 Kevin Karplus
    In a number of ways I like try3 better than MQAU1. Perhaps I should do some patching of its bad points, though, to fix things up. For example, the bad breaks at
        P44, L45, L46 could be patched with D34-Q49,
        A623, A632 could be patched with H621-A632,
        G482, R482, E483, S484 with G480-G489,
        E76, G77, T78, L90, Y91, P100, K101 with W67-P103
        F338?
        H382?
        H445, R446, W447?
        G499 and H500?
        G126, V127, W128?
        P26, W27?
        C175, E176?
        K329?
        R199, R200?
        A278?
        H256, L260, L261, V262?
        L148, G149?
        A589?
        W437?
        V666?
    This will take forever to optimize, unless I break it into domains.

Tue Jul 15 16:13:41 PDT 2008 Kevin Karplus
    I started subdomain predictions for R25-L98, I173-P263, G323-V463, G459-V685. When they are done, I'll chop out the corresponding parts of try3 and MQAU1 and try optimizing each domain separately with high crossover.

Tue Jul 15 16:44:56 PDT 2008 Kevin Karplus
    These subdomains seem to be too small to pick up the signals that lead to the recognition of 1u04A, 1w9hA, 1si2A, and 1r4kA (c.55.3.10 and b.34.14.1 families). The G459-V685 subdomain does get good hits on the c.55 fold: 1yvuA, 1u04A, 1w9hA, and (weakly) 2fsjA.
    I'm sending the whole try3-opt3 model to VAST, to pick out where the structure comes from.
        Your VAST Search job was submitted at 07/15/2008 19:45:09(EDT).
        Request ID: 360713589810141423

Tue Jul 15 17:20:59 PDT 2008 Kevin Karplus
    The best whole-length VAST hits are to 1yvuA and 1z26A, covering residues E8-F683.
        1ytuB covers K320-G670
        1z26A covers S327-E483
    For subdomains identified by VAST:
        1-12, 303-331, 571-626      1z26A, 1yvuA
        13-21, 125-174, 273-302     2qt7A, 2cxcA, 2v7bA
        175-199, 257-272            all too short
        P26-R120                    1z26A, 3cjtA, 3cjsA
        E203-H256                   2ekmA, 1rlhA, 1tr8A
        S328-E438                   1z26A, 1ytuB, 2p9hA, 1zrhA
        R446-586,630-F683           1z26A, 1yvuA, 1asuA, 1ekeB, 1ilyA, 1wn1A, 2qh9A

Tue Jul 15 17:39:52 PDT 2008 Kevin Karplus
    I started a subdomain prediction for G302-V685, which should be a whole fold except for one strand that comes from M1-N12.

Tue Jul 15 17:43:56 PDT 2008 Kevin Karplus
    I also started N12-P306 for the first half of the protein.

Tue Jul 15 23:29:16 PDT 2008 Kevin Karplus
    The 4 shorter subdomain runs are done.
    R25-L98 didn't find any strong hits (1o59A at E-value 65). The R25-L98/try1-opt3 model looks plausible, but no better than the regions cut out of try3-opt3 and MQAU1-opt3.
    I173-P263 also has no strong hits (1rmwA at E-value 27.8). The I173-P263/try1-opt3 model is not even as reasonable as the regions cut from try3-opt3 and MQAU1-opt3.
    G323-V463 has a weak hit to c.55.3.10 domains (1w9hA E-value=1.29, 1u04A E-value=1.426). The try1-opt3 model agrees fairly well with the regions cut from try3-opt3 and MQAU1-opt3. It may be worthwhile to try doing an optimization in this subdomain, to get the loops better solved.

Tue Jul 15 23:43:23 PDT 2008 Kevin Karplus
    G323-V463/try2 started on the moai cluster, with sheet and helix constraints from G323-V463/try1.
    G459-V685 has strong hits for c.55.3.10 domains (1u04A and 1w9hA < 1e-25), and weak hits for other c.55.* folds. The different predictions are in good agreement. In fact, the from-MQAU1 model scores better than the G459-V685/try1-opt3 model with the try1 costfcn.

Tue Jul 15 23:52:04 PDT 2008 Kevin Karplus
    G459-V685/try2 started on the moai cluster.
    N12-P306 has moderate hits to 1u04A (2.4e-04), but has not finished building its try1 model yet.
    G302-V685 has strong hits to c.55.3.10 domains (1u04A 1.8e-40, 1w9hA 3.6e-36) but hasn't finished building its try1 model yet. I expect this model to be close to the ones for the whole-length protein, with perhaps slightly better loops.

Thu Jul 17 12:03:52 PDT 2008 Kevin Karplus
    The N12-P306 and G302-V685 models do look pretty good. I should do a little clash and break removal in each, and then make a chimera, superimposing them on either MQAU1 or try3.
    First I should re-extract sheets from them, as I've done some bug fixes to the algorithm to make it extract sheets that are a little more messed up.

Thu Jul 17 12:15:37 PDT 2008 Kevin Karplus
    try2 runs started in both N12-P306 and G302-V685.

Thu Jul 17 16:25:01 PDT 2008 SAM-T08-MQAO hand QA T0487 Submitted
Thu Jul 17 16:25:01 PDT 2008 SAM-T08-MQAU hand QA T0487 Submitted
Thu Jul 17 16:25:01 PDT 2008 SAM-T08-MQAC hand QA T0487 Submitted

Thu Jul 17 16:35:58 PDT 2008 Kevin Karplus
    The N12-P306/try2 and G302-V685/try2 runs both reduce breaks and clashes, but the gromacs0 optimization improves them further, so clearly more clash reduction is needed. The sheet scores are not great for either one, though better than the try1 runs.
    I'll try combining them to make an N2-C2 chimera, but which model should I use as a template to align the domains? Perhaps whichever scores best with the sheets from N2 and C2?

Thu Jul 17 16:48:15 PDT 2008 Kevin Karplus
    I made a try4 costfcn with N2 and C2 sheets and helices, and MQAU1-opt3.gromacs0.repack-nonPC scores best, so I'll use that as the template.
    It looks like I should take
        M1-L121     MQAU1-opt3.gromacs0.repack-nonPC
        R122-E176   N12-P306/try2-opt3.gromacs0
    There is a bad break after Y135, and I may want to add some sheet constraints for tucking L124-Y135 into place.
    I don't like *either* model for the next region.
    I could try
        A170-T201   N2
        W202-S280   MQAU1
    I have to continue with MQAU1 to get a connection between the domains, and MQAU1 looks better for a while
        -L465       MQAU1
        S466-R574   C2
        K575-P583   MQAU1
        V584-D660   C2
        R661-V685   MQAU1
    Putting this together gives me
        M1-L121     MQAU1-opt3.gromacs0.repack-nonPC
        R122-T201   N12-P306/try2-opt3.gromacs0
        W202-L465   MQAU1
        S466-R574   G302-V685/try2-opt3.gromacs0
        K575-P583   MQAU1
        V584-D660   G302-V685/try2-opt3.gromacs0
        R661-V685   MQAU1
    I'll modify try4.costfcn to include MQAU1 sheets and helices as well.

Thu Jul 17 18:05:18 PDT 2008 Kevin Karplus
    I started try4 to attempt to close gaps in chimera-N2-MQAU1-C2.

Thu Jul 17 21:24:42 PDT 2008 Kevin Karplus
    try4-opt3 does not score quite as well as MQAU1 and MQAC1, at least not in the gromacs optimized versions of each, mainly because it still has some bad breaks.
    I'll do a polishing run on try4 to see if I can best the MQAU1 and MQAC1 runs. It will start from all SAM+undertaker models, but not the MQA models.

Fri Jul 18 08:03:28 PDT 2008 Kevin Karplus
    try5-opt3 scores better but still has some bad breaks.
    I still don't like any of my models for I173-P263. Perhaps I should optimize that domain separately, adding constraints
        I173.CA V262.CA 7.44
        L174.CA V262.CA 6.64
        C175.CA V262.CA 7.09
        I173.CA P263.CA 5.41
        L174.CA P263.CA 6.74
        C175.CA P263.CA 8.55
    so that it can be put back in.

Fri Jul 18 08:27:45 PDT 2008 Kevin Karplus
    I started I173-P263/try2 with an inconsistent set of constraints that favors the model from-try5 a little. I also started I173-P263/try3 with the same costfcn, but starting from alignments.

Fri Jul 18 08:43:41 PDT 2008 Kevin Karplus
    Perhaps more to the point, I started I173-P263/try4 with constraints taken from the alignment of that region to 1u04A, starting from alignments.

Fri Jul 18 08:46:56 PDT 2008 Kevin Karplus
    Looking at other parts of the try5 model:
    I wonder if I am missing a meander for K320-R335. There seems to be one almost forming.
    1u04A DOES have an extra strand there, with i,i+8 hbonds. I should add a sheet constraint, but I'm a little confused about which one
        SheetConstraint P319 M322 K329 A326 Hbond V327
    ?
    The i,i+8 pairing would have L321 with K329, which would require some remodeling of the try5 model, but this looks a bit more like 1u04A:
        SheetConstraint K320 G323 P330 V327 Hbond L321
    L123-R136 is still messed up, with a bad break before E138. I need to look at what happens in that region in 1u04A. Ah---it forms a helix for roughly E125-A133. I should add that as a constraint!
        HelixConstraint E125 A133

Fri Jul 18 10:31:02 PDT 2008 Kevin Karplus
    Oops---the do2 and do4 runs were started in the wrong directory (I173-P263/decoys/ instead of I173-P263/), so I've restarted them in the correct directory.

Fri Jul 18 11:42:18 PDT 2008 Kevin Karplus
    The I173-P263/try4 run does not get the sheets I expected. I'll try again as try5, but turn the sheet constraints up and the helix constraints down.

Fri Jul 18 13:16:41 PDT 2008 Kevin Karplus
    I173-P263/try5 also did not pick up the 1u04A alignment.

Fri Jul 18 13:26:13 PDT 2008 Kevin Karplus
    No wonder! I had asked for the sheets from model 8 of I173-P263/T0487.undertaker-align.pdb, but that's 1jiwI. I wanted model 5 (1u04A). Let me try again with the RIGHT constraints.

Fri Jul 18 14:42:26 PDT 2008 Kevin Karplus
    Oops, forgot to create I173-P263/try6.under, so it didn't run. Starting it again.

Fri Jul 18 19:36:39 PDT 2008 Kevin Karplus
    I173-P263/try6 is based on 2dtrA+1xd5A+2uubT+1fjgT and is still not really getting the sheets I expected. Ah---model5 is from 1u04A, but is too short an alignment to contain the desired sheets!

Fri Jul 18 20:01:22 PDT 2008 Kevin Karplus
    For I173-P263/try7, I'll get constraints from the whole try3 model, the whole alignment to 1u04A and the N12-P306 alignment to 1u04A, with the most weight on the N12-P306 alignment.
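
    Note: CA-CA distance constraints like the I173-P263 boundary list in the Fri Jul 18 08:03:28 entry could be measured from a model file with something like the sketch below. This is only an illustration in Python, not part of undertaker; the function names read_ca_coords and ca_distance_lines are made up for this example, and the output simply mimics the "I173.CA V262.CA 7.44" layout used in the notes rather than any documented costfcn syntax.

```python
import math

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def read_ca_coords(pdb_text):
    """Return {residue_number: (one_letter_code, (x, y, z))} for CA atoms.

    Uses the fixed-column layout of PDB ATOM records. Only the first CA
    seen for each residue number is kept (hypothetical helper).
    """
    coords = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            resnum = int(line[22:26])
            aa = THREE_TO_ONE.get(line[17:20].strip(), "X")
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            coords.setdefault(resnum, (aa, xyz))
    return coords


def ca_distance_lines(coords, pairs):
    """Format one 'I173.CA V262.CA 7.44'-style line per residue pair."""
    lines = []
    for i, j in pairs:
        aa_i, xyz_i = coords[i]
        aa_j, xyz_j = coords[j]
        distance = math.dist(xyz_i, xyz_j)  # Euclidean distance in Angstroms
        lines.append(f"{aa_i}{i}.CA {aa_j}{j}.CA {distance:.2f}")
    return lines
```

    For the boundary used here, the pairs would be every combination of residues 173, 174, 175 with residues 262 and 263, read from whichever model the subdomain is to be reinserted into.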

Fri Jul 18 21:18:01 PDT 2008 Kevin Karplus
    I173-P263/try7 did not produce a clean model, so I'll do try8, try9, and try10, each with only one of the three sets of constraints, to avoid conflicting constraints.
        I173-P263/try8      align1 from whole chain
        I173-P263/try9      align1 from N12-P306/
        I173-P263/try10     try3 from whole chain

Sat Jul 19 13:56:50 PDT 2008 Kevin Karplus
    None of I173-P263/try8, I173-P263/try9, I173-P263/try10 did anything very useful. I'll just try polishing from-try3 and stick it back into try5.

Tue Jul 22 16:07:29 PDT 2008 Kevin Karplus
    I173-P263/try11-opt3 came out pretty good, but the gromacs-optimized version scored better, so I'm doing a polishing run to try to close gaps and remove clashes. Then I'll take a I173-P263/try12 model, stick it back into try5, and polish the resulting complete model.

Fri Jul 25 04:29:52 PDT 2008 Kevin Karplus
    I finally did it---I had try12 with "try2" instead of "try12" everywhere and overwrote the existing try2 files in I173-P263/decoys. I'll redo try12 as try13, and try to get it right this time.

Fri Jul 25 05:38:26 PDT 2008 Kevin Karplus
    I made a chimera of try5-opt3.gromacs0 and I173-P263/try13-opt3, using L174-L261 from the subdomain model. For try6, I will polish this chimera.

Fri Jul 25 11:33:37 PDT 2008 Kevin Karplus
    try6 is the best model so far. It seems that polishing the subdomain with constraints that would make it easy to reinsert was effective in producing a usable chimera.
    I'll now do a polishing run starting from just the gromacs-optimized models to try to cut down clashes and breaks. I should then look to see if we have lost any sheets from the templates, and see if fixup is needed.

Sat Jul 26 03:45:54 PDT 2008 Kevin Karplus
    try7-opt3 polishes try6-opt3.gromacs and manages to reduce breaks somewhat without losing anything significant on other cost functions, though I'm a bit worried about the n_ca_c bond angles getting a little too high a cost.
    rosetta now likes try7-opt3.gromacs0.repack-nonPC best.
    Some of the top models are a little off on the first strand, which is present in try3
        SheetConstraint (T0487)G5 (T0487)L11 (T0487)R315 (T0487)V309 hbond (T0487)T7 1
    Perhaps I should copy M1-N12 and P306-I318 from try3-opt3.gromacs0 and G131-G149 from MQAU1-opt3.gromacs0.repack-nonPC

Sat Jul 26 13:07:20 PDT 2008 Kevin Karplus
    try8-opt3, optimized from chimera-try7-try3-MQAU1, now scores the best with the try8 costfcn, and try8-opt3.gromacs0.repack-nonPC has the lowest Rosetta energy.
    I like the fixes that were made in try8, and I'd like to add another strand that has just barely been lost:
        SheetConstraint V9 N12 P583 R580 hbond R580
    I'll do a try9 optimization with this constraint added, starting from all gromacs-optimized models (like try7).

Sat Jul 26 16:19:32 PDT 2008 Kevin Karplus
    try9 increased breaks in order to get the extra sheet constraints. What it actually did was to optimize try7-opt3 instead of try8-opt3. I'll try again in try10, starting only from the try8 models and not the try7 ones, and increasing the break cost a bit more.

Sat Jul 26 19:07:01 PDT 2008 Kevin Karplus
    try10 does improve on try8-opt3.gromacs0, but try10-opt3.repack-nonPC scores best with the try10 costfcn. Rosetta likes try10-opt3.gromacs0.repack-nonPC best.
    I'm curious where the I173-P263 domain comes from, so I'll submit it to VAST.
        Your VAST Search job was submitted at 07/26/2008 22:25:23(EDT).
        Request ID: 703700384036415691
    Two long hits:
        PDB   C  D  Ali. Len.  SCORE  P-VAL    RMSD  %Id   Description
        1R4K  A     81         9.9    10e-4.9  1.7   8.6   Solution Structure Of The Drosophila Argonaute 1 Paz Domain
        1YVU  A  4  53         7.9    0.0025   1.9   11.3  Crystal Structure Of A. Aeolicus Argonaute

Sat Jul 26 19:33:15 PDT 2008 Kevin Karplus
    I think I've reached the point where further optimization is not going to improve things.
    I'll submit
        ReadConformPDB T0487.try10-opt3.gromacs0.repack-nonPC.pdb # < try8-opt3.gromacs0 < chimera-try7-try3-MQAU1
        ReadConformPDB T0487.try9-opt3.pdb # < try7-opt3.gromacs0 < try6-opt3.gromacs0 < chimera-try5-try13
        ReadConformPDB T0487.try5-opt3.gromacs0.pdb # < try4-opt3.gromacs0.repack-nonPC < chimera-N2-MQAU1-C2
        ReadConformPDB T0487.MQAU1-opt3.gromacs0.repack-nonPC.pdb # < GS-KudlatyPred_TS2
        ReadConformPDB T0487.try3-opt3.gromacs0.pdb # < try2-opt3.gromacs0 < try1-opt3.gromacs0.repack-nonPC < align(1u04A+1si2A+1zxxA)

Sat Jul 26 19:46:21 PDT 2008 Kevin Karplus
    Submitting with comment

    T0487 had obvious homology to 1yvuA, 1u04A, and for the separate domains to 1w9hA, 1r4kA, 1si2A, 1vynA, 1t2rA, and 1r6zA.

    I had the most trouble with the subdomain I173-P263, which did not appear to have homologs. The model I ended up with appears to come primarily from 1r4kA, though it is also similar to 1yvuA.

    With such a large model, the normal-length optimization runs did not get as much gap closure and clash removal as they would have on shorter proteins. I tried doing separate domains a little bit (most successfully with N12-P306 and G302-V685), and pasting the pieces back together. Some additional cut-and-paste was needed to get the initial strand properly in place.

    Although my MQA runs like Zhang-Server_TS3 best, my initial meta-server run ended up picking GS-KudlatyPred_TS2 as its primary template. I submit this meta-server run as model 4, and included bits and pieces of it when gluing together some of my subdomain predictions.

    The final model is not highly polished, but further optimization is not likely to make huge improvements, and I'm pretty burned out by the end of CASP season.

    Model
    1   T0487.try10-opt3.gromacs0.repack-nonPC.pdb # < try8-opt3.gromacs0 < chimera-try7-try3-MQAU1
            chimera-try7-try3-MQAU1:
                mostly T0487.try7-opt3.pdb
                M1-N12 and P306-I318 from try3-opt3.gromacs0
                G131-G149 from MQAU1-opt3.gromacs0.repack-nonPC
            try7-opt3 < try6-opt3.gromacs0 < chimera-try5-try13
            try3-opt3 < try2-opt3.gromacs0 < try1-opt3.gromacs0.repack-nonPC < align(1u04A+1si2A+1zxxA)
    2   T0487.try9-opt3.pdb # < try7-opt3.gromacs0 < try6-opt3.gromacs0 < chimera-try5-try13
            chimera-try5-try13:
                mostly T0487.try5-opt3.gromacs0.pdb
                L174-L261 from I173-P263/try13-opt3 < I173-P263/try11-opt3.gromacs0 < try3-opt3
            try5-opt3 < try4-opt3.gromacs0.repack-nonPC < chimera-N2-MQAU1-C2
            try3-opt3 < try2-opt3.gromacs0 < try1-opt3.gromacs0.repack-nonPC < align(1u04A+1si2A+1zxxA)
    3   T0487.try5-opt3.gromacs0.pdb # < try4-opt3.gromacs0.repack-nonPC < chimera-N2-MQAU1-C2
            chimera-N2-MQAU1-C2:
                M1-L121     MQAU1-opt3.gromacs0.repack-nonPC < GS-KudlatyPred_TS2
                R122-T201   N12-P306/try2-opt3.gromacs0 < N12-P306/try1-opt3.repack-nonPC < align(1u04A?)
                W202-L465   MQAU1-opt3.gromacs0.repack-nonPC < GS-KudlatyPred_TS2
                S466-R574   G302-V685/try2-opt3.gromacs0 < G302-V685/try1-opt3 < align(1w9hA)
                K575-P583   MQAU1-opt3.gromacs0.repack-nonPC < GS-KudlatyPred_TS2
                V584-D660   G302-V685/try2-opt3.gromacs0 < G302-V685/try1-opt3 < align(1w9hA)
                R661-V685   MQAU1-opt3.gromacs0.repack-nonPC < GS-KudlatyPred_TS2
    4   T0487.MQAU1-opt3.gromacs0.repack-nonPC.pdb # < GS-KudlatyPred_TS2
    5   T0487.try3-opt3.gromacs0.pdb # < try2-opt3.gromacs0 < try1-opt3.gromacs0.repack-nonPC < align(1u04A+1si2A+1zxxA)
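
    Note: the chimera-N2-MQAU1-C2 segment list under model 3 can be sanity-checked for gaps and overlaps before any cut-and-paste. The sketch below is an illustration in Python, not one of the scripts actually used; the names SEGMENTS, parse_range, and tiling_problems are invented for this example. It only checks that the residue ranges tile the 685-residue chain in order; it does not touch coordinates.

```python
import re

# Segment table copied from the chimera-N2-MQAU1-C2 description:
# (residue range, model the segment is taken from).
SEGMENTS = [
    ("M1-L121", "MQAU1-opt3.gromacs0.repack-nonPC"),
    ("R122-T201", "N12-P306/try2-opt3.gromacs0"),
    ("W202-L465", "MQAU1-opt3.gromacs0.repack-nonPC"),
    ("S466-R574", "G302-V685/try2-opt3.gromacs0"),
    ("K575-P583", "MQAU1-opt3.gromacs0.repack-nonPC"),
    ("V584-D660", "G302-V685/try2-opt3.gromacs0"),
    ("R661-V685", "MQAU1-opt3.gromacs0.repack-nonPC"),
]


def parse_range(text):
    """Turn a range such as 'R122-T201' into the integer pair (122, 201)."""
    match = re.fullmatch(r"[A-Z](\d+)-[A-Z](\d+)", text)
    if match is None:
        raise ValueError(f"not a residue range: {text!r}")
    start, end = int(match.group(1)), int(match.group(2))
    if start > end:
        raise ValueError(f"range runs backwards: {text!r}")
    return start, end


def tiling_problems(segments, length):
    """Return a list of gap/overlap messages; empty if 1..length is tiled."""
    problems = []
    expected = 1  # next residue number that still needs a source model
    for text, _source in segments:
        start, end = parse_range(text)
        if start > expected:
            problems.append(f"gap: residues {expected}-{start - 1} before {text}")
        elif start < expected:
            problems.append(f"overlap: {text} starts before residue {expected}")
        expected = max(expected, end + 1)
    if expected <= length:
        problems.append(f"gap: residues {expected}-{length} at C-terminus")
    return problems


if __name__ == "__main__":
    for message in tiling_problems(SEGMENTS, 685) or ["segments tile 1-685"]:
        print(message)
```

    Keeping the table as data also makes it easy to rerun the check after swapping the source model for one segment.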