Wed Jun 9 10:20:51 PDT 2004 T0202 DUE 6 Aug 2004

Wed Jun 9 18:04:43 PDT 2004 Kevin Karplus

This looks like a fold-recognition target with 3pfk (c.89.1.1) or 2tysB (c.79.1.1) as the best match. It may be a new fold, but I'm hoping for an existing one, as there are a LOT of beta strands.

try1 is astonishingly ugly---only one strand-helix-strand has formed

    Hbond I43.N P64.O # 3.05223
    Hbond I43.O F66.N # 2.86163
    Hbond S45.N F66.O # 3.00959

I'll put these into try2, along with the strand constraints that were accidentally omitted.

Hmmm, the break penalties may be a bit too high, making it difficult to move pieces around enough. The beta sheets for this one may need to be constructed by hand.

Wed Jun 9 22:40:10 PDT 2004 Kevin Karplus

try2 doesn't look much better than try1. This one will probably need hand assembly.

Thu Jun 10 09:43:00 PDT 2004 Kevin Karplus

The only thing that is coming from the initial alignments is the strand-helix-strand of I43-G47 paired with P64-I68. We'll have to do the rest of the pairing manually! We can get some of the hairpins, I think.

Fri Jun 11 15:14:26 PDT 2004 Kevin Karplus

Try4-opt2 doesn't score quite as well as try3-opt2 (even with the try4.costfcn), but does have a number of hairpins. Perhaps we should try doing a crossover optimization from the models so far, and see if anything can be improved. I'll lower break and clash penalties a bit. This will be try5.

Sat Jun 12 08:34:48 PDT 2004 Kevin Karplus

In try5-opt2, a beta sheet is beginning to form. The N-terminal hairpin belongs in it somewhere, but I'm not sure where. K196-A201 (predicted as strand) has curled up into a helix, but it probably belongs antiparallel to S213-D217 and to Y189-V191:

    188> pYVVsm
    201< eAIVEIKre
    210> gqKSVDFd

Sat Jun 12 21:07:13 PDT 2004 Kevin Karplus

I have some bits and pieces of beta sheet now in try6, but still a lot more to form---this looks like it will be a slow process of guessing sheet topologies, adding constraints, and seeing which work out.
We could run a few in parallel, but I'm having enough trouble working on multiple targets without also working on several conjectures for each.

Fri Jun 25 18:30:27 PDT 2004 Kevin Karplus

Ligand news from CASP: T0202 -- NADPH

Thu Jul 15 11:44:40 PDT 2004 Kevin Karplus

It looks like the mutual-information predictions are pretty strong for this target, so those constraints may be worth using to increase the chance of folding things right.

I modified superimpose-best.under to create helices and sheets files for the models (undertaker, alignment, or robetta).

It looked to me that robetta-model1 has a bad knot, so I'm scoring them all with the "knot" cost function also. It seems that robetta models 1 and 5 have knots, as do the first two models from alignments (which is probably an artifact of missing residues).

Thu Jul 15 16:54:23 PDT 2004 Kevin Karplus

try7 has some fairly good sheet fragments. I don't like it making E28-F30 into a strand though.

Fri Jul 16 15:02:16 PDT 2004 Kevin Karplus

Maybe I need to create a "strands" rasmol script to label the probable strands:

    define s1 2-7
    define s2 12-16
    define s3 42-45
    define s4 65-69
    define s5 97-99
    define s6 102-107
    define s7 113-119
    define s8 127-134
    define s9 137-143
    define s10 146-150
    define s11 173-178
    define s12 189-191
    define s13 196-201
    define s14 204-208
    define s15 211-215
    define s16 219-224
    define s17 230-232
    define beta s1 or s2 or s3 or s4 or s5 or s6 or s7 or s8 or s9 or s10 or s11 or s12 or s13 or s14 or s15 or s16
    define h1 21-27
    define h2 50-58
    define h3 81-91
    define h4 158-163
    define hall h1 or h2 or h3 or h4

Fri Jul 16 16:59:41 PDT 2004 Kevin Karplus

In addition to the "easy" sheet constraints we've guessed already, and some weak constraints from George's predictions, I'm going to put in some strong constraints to cluster the 3 conserved ASP residues (D49, D145, and D217), assuming that they form a catalytic triad.
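A quick way to check whether the three conserved aspartates are actually coming together in a model is to measure their pairwise CA-CA distances. The sketch below is not part of undertaker or the "triad" rasmol script; it assumes a standard fixed-column PDB file with a single chain, and the helper name ca_line and the coordinates in the usage example are made up purely for illustration.

```python
import itertools
import math


def ca_line(serial, res_name, res_seq, x, y, z):
    """Build a fixed-column PDB ATOM record for a CA atom (illustrative helper)."""
    return ("ATOM  {:5d}  CA  {:>3s} A{:4d}    {:8.3f}{:8.3f}{:8.3f}"
            .format(serial, res_name, res_seq, x, y, z))


def ca_coords(pdb_lines):
    """Map residue number -> (x, y, z) of its CA atom, ignoring chain IDs."""
    coords = {}
    for line in pdb_lines:
        # PDB columns: atom name 13-16, residue number 23-26, x/y/z 31-54.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            coords[int(line[22:26])] = (
                float(line[30:38]), float(line[38:46]), float(line[46:54]))
    return coords


def triad_distances(pdb_lines, residues=(49, 145, 217)):
    """Pairwise CA-CA distances (Angstroms) among the given residues."""
    coords = ca_coords(pdb_lines)
    return {(a, b): math.dist(coords[a], coords[b])
            for a, b in itertools.combinations(residues, 2)}


# Synthetic coordinates, NOT taken from any T0202 model.
demo = [
    ca_line(1, "ASP", 49, 0.0, 0.0, 0.0),
    ca_line(2, "ASP", 145, 3.0, 4.0, 0.0),
    ca_line(3, "ASP", 217, 0.0, 0.0, 12.0),
]
demo_distances = triad_distances(demo)
for pair, dist in demo_distances.items():
    print(pair, round(dist, 2))
```

CA-CA distance is only a rough proxy for whether the side chains could form a site, but it is enough to see at a glance whether the triad constraints are being satisfied from one try to the next.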
Fri Jul 16 19:48:24 PDT 2004 Kevin Karplus

try8 scores well under the new cost function, but the aspartic acids have not come together yet. (I have a rasmol script highlighting the triad, called "triad".)

It looks like s15 should be parallel to s3 and s4 and antiparallel to s10. S3 and S4 are making a nice parallel connection now, so the question is one of ordering, with 3!=6 orders:

    A  s3 || s4 || s15 ^v s10
    B  s3 || s4 ^v s10 ^v s15
    C  s15 || s3 || s4 ^v s10
    D  s15 ^v s10 ^v s3 || s4
    E  s10 ^v s3 || s4 || s15
    F  s10 ^v s15 || s3 || s4

One of the neural-net constraints (P165-I175) weakly implies that s10 || s11---a constraint we had already put in. If that is correct, then we can rule out orders B and D. We also have s15^vs16, so we can rule out A and F, leaving only C and E:

    C  s17 ^v s16 ^v s15 || s3 || s4 ^v s10 || s11
    E  s11 || s10 ^v s3 || s4 || s15 ^v s16 ^v s17

Another constraint, L76-P165, implies that the helix between s4 and s5 is near the helix between s10 and s11, so I favor the C ordering.

One problem with this conjecture---s15 is almost all hydrophobic, and s3 is almost all polar. I'll try it anyway for try9, with increased weight on the triad constraints also.

Fri Jul 16 22:14:32 PDT 2004 Kevin Karplus

try10 is the same as try9, but with the fixed version of undertaker that gets the hbonds right (I hope) in the SheetConstraints. try10 should be able to do a better job, since it has more consistent constraints internally.

Sat Jul 17 09:57:14 PDT 2004 Kevin Karplus

try9-opt2 scores a bit better than try10-opt2, but neither one is really great. I'll have to spend some more time looking at the pieces and seeing whether I can fit them together.

Sun Jul 18 12:46:55 PDT 2004 Kevin Karplus

I didn't get the time or energy yesterday to do anything with this target.
There are very few sheets actually formed:

try8-opt2
    SheetConstraint (T0202)F42 (T0202)V46 (T0202)P64 (T0202)I68 hbond (T0202)I43
    SheetConstraint (T0202)D129 (T0202)V134 (T0202)I142 (T0202)V137 hbond (T0202)L132
    SheetConstraint (T0202)V214 (T0202)D217 (T0202)E223 (T0202)I220 hbond (T0202)D217

try9-opt2
    SheetConstraint (T0202)I43 (T0202)V46 (T0202)I65 (T0202)I68 hbond (T0202)I43
    SheetConstraint (T0202)I128 (T0202)V130 (T0202)I142 (T0202)D140 hbond (T0202)V130
    SheetConstraint (T0202)A131 (T0202)V134 (T0202)D140 (T0202)V137 hbond (T0202)V134

try10-opt2
    SheetConstraint (T0202)I43 (T0202)V46 (T0202)I65 (T0202)I68 hbond (T0202)I43
    SheetConstraint (T0202)R133 (T0202)V134 (T0202)E138 (T0202)V137 hbond (T0202)V134

The 1pfkA structure used as a template has two sheets:

    The terminal sheet is s3 ^v s2 || s1 || s4 || s5 || s10 ^v s11
    The middle sheet is s7 || s8 || s6 || s9.

(Strands are numbered in order along the chain of 1pfkA; the numbering does not necessarily correspond to the numbering of the target strands.)

Nowhere in the predicted secondary structure do we have the strand/helix alternation to make a 4-strand parallel sheet, needed for matching EITHER sheet of 1pfkA. I think this is a poor match for fold recognition.

The second best domain hit was for 2tysB, which also has a 4-strand parallel sheet in the middle but with order CBAD, rather than BCAD. The other sheet is complicated with A ^v B || F || C || D || E, with the domain having the 4-strand sheet inserted between B and C. Again, I don't have anywhere near enough helices for this fold to be useful.

For try11, I'm giving up on the templates to a large extent, and just putting together antiparallel constraints based on adjacency in the sequence. I'll try adding a couple of weaker sheet constraints for other possibilities.

try11 will read in the alignments, but not do TryAllAlign---using fragment insertion rather than starting from a bad initial alignment.

Sun Jul 18 21:43:13 PDT 2004 Kevin Karplus

try11 is an ugly mess with many breaks.
Only s3 || s4 formed nicely. Still, it scores much better with the try11.costfcn than anything else we've looked at, so maybe polishing it with bigger break penalties might produce something feasible.

Mon Jul 19 08:26:41 PDT 2004 Kevin Karplus

try12 is taking an extremely long time on whinny. After I get out the opt1 version, I'll kill the job, look at the result, and start a much shorter run.

Tue Jul 20 13:57:58 PDT 2004 Kevin Karplus

I got busy and let it finish. The results for try12-opt2 are still very ugly, with lots of breaks. I obviously have some (or all) of the sheet constraints wrong. Still, with an unconstrained cost function, try12 scores the best of any of our models.

I'm wondering if the antiparallel action is really happening in some sort of sandwich structure, with the turns moving to the other sheet, rather than back to the same sheet.

Wed Jul 21 09:16:06 PDT 2004 Kevin Karplus

I'm going to try removing a lot of sheet constraints, putting in just the ones for strongly predicted turns at

    P109-D110  S6 ^v S7
    D135-G136  S8 ^v S9
    M193-E194  S12 ^v S13
    D209-D210  S14 ^v S15
    D217-G218  S15 ^v S16

I'll also include all the strongly predicted (P>0.5) contacts from George's "280.rr" predictions.

Before I start the run, I'll need to create a T0202.t04.many.frag fragment file---especially since t2k and t04 disagree about the secondary structure predictions in places.

Wed Jul 21 11:05:53 PDT 2004 Kevin Karplus

Try13 seems to be mainly polishing try9, which may not be the most desirable thing to do. It might be worth starting from the alignments with the same cost function. I'll set that up as try14, but I'll change the constraint weights a bit to put more weight on the helix and strand constraints.

Wed Jul 21 11:07:44 PDT 2004 ggshack

Just to take a look, I am starting from try13, switching to bonus_constraint and boosting break. Starting TRY15 on caw.

Wed Jul 21 17:49:13 PDT 2004 Kevin Karplus

Someone rebooted crow at 13:51, killing my try14 job. Foo!
I wish whoever did it had at least sent me a message saying they were going to kill my job! Since try14 produced nothing before it was killed, I'll have to restart it somewhere.

George just sent me a note (AGAIN FORGETTING TO PUT ANYTHING IN THE README FILE):

    Subject: T0202.try15-opt1 is available
    Date: Wed, 21 Jul 2004 17:10:41 -0700

    This is the one I ran with bonus_constraint. I thought it would put the RR constraints closer but it appears it didn't, not as much as I thought. On the other hand, some sheets appeared (but not as phobic as they should be...).
    - George

Thu Jul 22 14:53:28 PDT 2004 Kevin Karplus

try14 scores best with unconstrained.costfcn. It looks really terrible to me, with many strands wound up into helices.

With George's try15 costfcn, try15 scores best. Some of the hairpins look decent, but the overall sheet topology still needs work. Still, it's better than many models we've created for this target.

With the try14 costfcn, the best-scoring are try13 and try15. try13 again has some decent bits of super-secondary structure in an overall poor model.

I'd be hard pressed to choose between try13 and try15--they both look mostly wrong. try13 brings the rr constraints a bit closer together, but doesn't seem to really match them.

Sat Jul 24 07:27:33 PDT 2004 Kevin Karplus

I should try SOMETHING on this target again, though I don't feel we've made a lot of progress. The unconstrained cost function likes try14 best, then try13, try15, try12, try5, try9, try6, try8, try2, try1, ...

I've not been able to get agreement from the t2k and t04 multiple alignments about what the secondary structures at the N- and C-terminal ends are. Most of the conserved residues agree, except for a conserved D near the end, which t2k thinks is D217, but t04 thinks is D209.
try14-opt2.sheets:
    SheetConstraint I43 V46 I65 I68 hbond S45
    SheetConstraint F100 V103 C144 F147 hbond R102
    SheetConstraint P101 V103 L171 C173 hbond V103
    SheetConstraint I117 L120 F147 A150 hbond I117
    SheetConstraint K212 D217 S225 I220 hbond S213

try13-opt2.sheets:
    SheetConstraint I43 V46 I65 I68 hbond I43
    SheetConstraint I128 V130 I142 D140 hbond V130
    SheetConstraint A131 V134 D140 V137 hbond L132

try15-opt2.sheets:
    SheetConstraint V6 K8 K14 H12 hbond V6
    SheetConstraint F42 G47 P64 N69 hbond I43
    SheetConstraint C105 M108 L114 V111 hbond S106
    SheetConstraint D129 V134 I142 V137 hbond L132
    SheetConstraint K212 D217 S225 I220 hbond D215
    SheetConstraint I222 K224 V44 V46 hbond E223

try12-opt2.sheets:
    SheetConstraint V44 V46 F66 I68 hbond S45

I'll try putting these into a cost fcn, with some attempt to reduce the conflicts, and add in lots of George's RR constraints (switching to "bonus" constraints below 0.6).

Sat Jul 24 16:35:22 PDT 2004 Kevin Karplus

try16 is full of breaks, but reasonably compact. It made only a slight improvement on try13 in the optimization, but it is the new best with the unconstrained costfcn.

Maybe I should try a polishing run with an unconstrained cost function (or perhaps helix and strand constraints, but no others) and give up on this target. We can submit whatever scores adequately with one of the cost functions, and whatever Rosetta hates least, but I don't think I have the energy or the inspiration to come up with a decent model for this one.

Sat Jul 24 21:12:41 PDT 2004 Kevin Karplus

try17-opt2 looks very familiar---it is probably just a minor polishing of try15. (In fact, the try15 costfcn scores try17-opt2 first.)

The unconstrained cost function now orders them try16, try14, try17, try13, try15, try12, try5, try9, ...

Rosetta likes best the try14-opt2.repack-nonPC, but perhaps I should say "hates least", since all the energies are enormous. (Order currently is try14, try17, try15, try16, try13, ...)
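Since the unconstrained cost function and Rosetta disagree about the order, one simple way to look at them together is to sum each model's rank position across the orderings. This is only an illustrative sketch (a Borda-style rank sum), not how the submission was actually chosen; the function name combined_order is hypothetical, and the lists contain only the models named explicitly in the Sat Jul 24 21:12 entry (both orderings are truncated with "..." in the notes).

```python
def combined_order(rankings):
    """Order models by summed rank position across several orderings.

    Only models present in every ordering are considered; ties are
    broken alphabetically so that the result is deterministic.
    """
    common = set(rankings[0]).intersection(*rankings[1:])
    totals = {m: sum(r.index(m) for r in rankings) for m in common}
    return sorted(common, key=lambda m: (totals[m], m))


# Orderings copied from the notes, best first (truncated there with "...").
unconstrained = ["try16", "try14", "try17", "try13",
                 "try15", "try12", "try5", "try9"]
rosetta = ["try14", "try17", "try15", "try16", "try13"]

order = combined_order([unconstrained, rosetta])
print(order)
```

A rank sum throws away the size of the score differences, so it is at best a rough tie-breaker when the individual cost functions disagree this much.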
Sun Jul 25 22:04:01 PDT 2004 Kevin Karplus

unconstrained now orders them try16, try14, try17, try18, try13, try15, ...
Rosetta orders them try14, try17, try15, try18, try16
strands.costfcn orders them try17, try15, try14, try16, try18, ...

I think we should submit

    try17-opt2
    try14-opt2.repack-nonPC
    try16
    try1
    best model from alignment

Mon Jul 26 17:10:55 PDT 2004 Kevin Karplus

The superimpose-best script shows

    ReadConformPDB T0202.try18-opt2.pdb
    ReadConformPDB T0202.try14-opt2.pdb
    ReadConformPDB T0202.try17-opt2.pdb
    ReadConformPDB T0202.try1-opt2.pdb
    InFilePrefix
    ReadConformPDB T0202.t2k.undertaker-align.pdb model 1

Was this a result of discussion or did I just forget? The README is more recent, so I'll go with it (also try17 looks better to me than try18).

Sun Sep 19 10:21:33 PDT 2004 Kevin Karplus

I put REAL_PDB:=1suwA into the Makefile and did a whole-chain rmsd evaluation. The order for our submitted models and the robetta models is

    model5, robetta2, robetta5, model4, model3, robetta1, robetta4, model1, model3, model2

The model5 rmsd is artificially good, because the model is incomplete. So this appears to be a target that robetta beat us on, with both robetta2 and robetta5 better than our models, and the fully automatic model4 beating our hand-improved models.

There are better models than any we submitted: try6-opt2 and try11-opt2 would both have beaten robetta5.

Wed Sep 22 10:37:49 PDT 2004 Kevin Karplus

The model5 number was bogus. Using GDT score to evaluate the models we get

    robetta1    20.48%
    robetta5    19.68%
    robetta2    18.78%
    robetta4    15.96%
    robetta3    14.66%
    try10-opt1  10.64%  our best
    model5      10.14%
    model1       9.44%
    model2       8.63%
    model3       8.23%
    model4       8.13%

These numbers are terrible.
Fri Sep 24 13:35:46 PDT 2004 Kevin Karplus

Switching to smooth GDT, we get terrible results still:

    name                    length  missing_atoms  rmsd     rmsd_ca  GDT       smooth_GDT
    robetta-model1.pdb.gz   249     0.0000         23.1629  22.4953  -20.5823  -19.7968
    robetta-model5.pdb.gz   249     0.0000         22.0141  20.9691  -20.3815  -18.9094
    robetta-model2.pdb.gz   249     0.0000         19.9898  19.1176  -18.7751  -18.2417
    robetta-model4.pdb.gz   249     0.0000         23.1815  22.6266  -16.0643  -15.4171
    robetta-model3.pdb.gz   249     0.0000         23.5823  22.5253  -14.7590  -14.4397
    T0202.try3-opt2.pdb.gz  249     0.0000         24.6744  23.8962  -10.2410  -10.1982
    model1.ts-submitted     249     0.0000         23.3773  22.6291  -9.4378   -9.5230
    model5.ts-submitted     249     1509           11.0238  9.9730   -9.2369   -8.9004
    model2.ts-submitted     249     0.0000         23.6508  23.0157  -8.6345   -8.4842
    model4.ts-submitted     249     0.0000         22.3748  21.9120  -8.1325   -8.1672
    model3.ts-submitted     249     0.0000         23.1261  22.4288  -8.2329   -8.1612
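A table like the one above is easy to re-sort by different columns, and to flag rows whose rmsd is not comparable because the model is incomplete. The sketch below is a hypothetical helper, not part of the undertaker scripts; it embeds a few rows copied from the table and assumes the GDT columns are negated costs, so that more negative is better.

```python
# A subset of rows copied from the evaluation table (whitespace-separated).
TABLE = """\
name length missing_atoms rmsd rmsd_ca GDT smooth_GDT
robetta-model1.pdb.gz 249 0.0000 23.1629 22.4953 -20.5823 -19.7968
T0202.try3-opt2.pdb.gz 249 0.0000 24.6744 23.8962 -10.2410 -10.1982
model1.ts-submitted 249 0.0000 23.3773 22.6291 -9.4378 -9.5230
model5.ts-submitted 249 1509 11.0238 9.9730 -9.2369 -8.9004
model4.ts-submitted 249 0.0000 22.3748 21.9120 -8.1325 -8.1672
model3.ts-submitted 249 0.0000 23.1261 22.4288 -8.2329 -8.1612
"""


def parse_scores(text):
    """Parse the table into a list of dicts with numeric columns as floats."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        fields = line.split()
        row = {"name": fields[0]}
        row.update({key: float(val)
                    for key, val in zip(header[1:], fields[1:])})
        rows.append(row)
    return rows


rows = parse_scores(TABLE)

# Assumption: scores are costs, so ascending order puts the best model first.
by_smooth = [r["name"] for r in sorted(rows, key=lambda r: r["smooth_GDT"])]
by_gdt = [r["name"] for r in sorted(rows, key=lambda r: r["GDT"])]

# Models with missing atoms have rmsd values that are not comparable.
incomplete = [r["name"] for r in rows if r["missing_atoms"] > 0]

print(by_smooth)
print(incomplete)
```

Sorting on both columns shows that model3 and model4 swap places between GDT and smooth GDT, which is a reminder of how small the differences among the submitted models are.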