Wed Jul 14 14:48:43 PDT 2004 T0242 DUE 23 Aug

Thu Jul 15 09:50:15 PDT 2004 Kevin Karplus

try1-opt2 gets 2 parallel strands and almost puts in a third, but skips
an antiparallel strand and a parallel one.

Strands appear to be
    s1: K3-A9
    s2: T20-I24
    s3: R70-N76
    s4: D79-Q88
    s5: I110-W116

s4 || s5 seems reasonably formed in try1-opt2:
    SheetConstraint A81 I86 I110 V115 hbond I83
but the exposed face of the sheet gets the buried residues (ok if dimer).

I think that s3 ^v s4 looks sensible:
    SheetConstraint R70 N76 M85 D79 hbond L82

I'm not sure yet where s1 and s2 belong.

The alignment does not have much diversity, so the 2ry and mutual-info
predictions are weak (NO significant MI pairs).

Possible disulfide C33-C62, but it seems a bit unlikely, since it is not
conserved in one of the putative homologs. (Could try optimizing with
and without an ssbond.)

Wed Aug 11 11:59:38 PDT 2004 Sol Katzman

Try2 did not form s3 ^v s4. But the hbond_geom weights were weak (from
the earlier version of undertaker) so I set them to the new standard
values. Also increased constraints a little (10 -> 20). Added an
s1 ^v s2 constraint. This will be try3.

Wed Aug 11 21:24:37 PDT 2004 Sol Katzman

Oops. I meant s1 || s2, and that is actually what I put in
try3.costfcn, and try3 did manage to make a 3-residue version of this
parallel sheet. Try3 also bonded a little of s2 parallel to s5. On the
other hand it failed to make any improvement in the desired s3 ^v s4
connection. And so far, this model is not looking much like any
protein. Maybe I will look at some of the server models.

Sun Aug 15 18:05:48 PDT 2004 Sol Katzman

I looked at models from ROBETTA, FUGMOD_SERVER, PROTINFO-AB which were
the source of some of our tries for the target T0243.

Aside: T0242 is from the same STIV virus as T0243. In fact it is ORF
B116 from that virus. See the T0243/README for more info on this
archaeal virus found in geysers in Yellowstone park.

The ROBETTA models fall into 3 classes:
    a) ROBETTA models 1,3,4,7,9,10 formed a 2-strand sheet s3 ^v s4
    b) ROBETTA models 2,6,8 do not form any real sheets
    c) ROBETTA model 5 forms part of a 3-strand sheet s3 ^v s4 || s5

The FUGMOD_SERVER models fall into these classes:
    a) FUGMOD_SERVER models 1,4 do not look very useful to us
    b) FUGMOD_SERVER models 2,3 are all-helix models we may want to use
    c) FUGMOD_SERVER model 5 is very interesting. It forms the
       s3 ^v s4 sheet but makes a sandwich with another sheet formed
       by s5 ^v s2r, where we define s2r as L45-H51, which is weakly
       predicted as a strand by t04. t04 also weakly predicts E38-L40
       as anti-parallel strand (call it s2q) with the DG turn at
       D42-G43.

The PROTINFO-AB models fall into these classes:
    a) PROTINFO-AB models 1,3,4 are all-helix, with some long loops
    b) PROTINFO-AB models 2,5 are all-helix, with knots

To review our models so far we have:
    try1,try2,try3 have s4 || s5
    try3 has some of s1 || s2, and a little s2 || s5
but we do not have s3 ^v s4 which is the strongest t2k and t04
prediction!

For try4, read in t04.many.frag (although it seems like the t04 and t2k
alignments are using the same sequences, so I doubt that the fragments
will be much different).

For try4 constraints, omit the HelixConstraint E41-I49 (possible strand
s2r, noted above) and keep the 3 existing StrandConstraints:
    s1 || s2
    s4 || s5
    s3 ^v s4    # modified to be like robetta model 5
Increase the hbond_geom_beta* weight, increase constraints weight.

For try4 remove TryAllAlign, instead read in robetta models
1,3,4,5,7,9,10 and FUGMOD_SERVER model 5. Do NOT read in the earlier
tries. (Before the run, try3 scores best on the try4 costfcn, followed
by all the robetta models.)

For try4h, let's try an all-helix model. Remove the SheetConstraints,
and read in FUGMOD_SERVER models 2,3 and PROTINFO-AB models 1,3,4.

Mon Aug 16 10:07:21 PDT 2004 Sol Katzman

Try4 successfully created the sheet s3 ^v s4 || s5 (like Robetta model5).
Try4 does not have s1 || s2 (unsurprising, since we did not read in any
of the alignments or previous tries). Rather than just copying robetta,
we should probably attempt to get s1 || s2 into this model, since that
is a fairly strong prediction from t2k and t04, which is hopefully our
added value. At this point the model is very extended, with lots of
holes, but we can work on that later.

For try5, I will read in some of the earlier tries, and increase
crossover (3->5 in opt1, 1->3 in opt2) in InitMethodProbs. Slightly
increase strand constraint on s1 and s2 (1.0 -> 2.5). Otherwise,
constraints are unchanged from try4.

The result of try4h, the all-helix model, is a rather large protein
with lots of holes. For try5h, read in the try4h models and also
increase phobic_fit (2 -> 5).

Incidentally, the biology here is that the closest templates, such as
1ixsB, are listed as Holliday junction helicases. Since this is a virus
protein, and I thought that viruses rely on the host for all that type
of machinery, I am curious as to whether this target has an unrelated
function.

Mon Aug 16 16:16:06 PDT 2004 Sol Katzman

It was suggested at our weekly meeting to try different topologies for
the strands. Now the try5 result had
    s1 || s2    slightly
    s4 || s5    good
    s3 ^v s4    slightly

For try6, I will attempt to get s3 between s4 and s5, so that the sheet
is fully antiparallel:
    s1 || s2            no change
    s5 ^v s3 ^v s4
I will also increase the weight on constraints (->35) and increase the
hbond_geom_beta* weights. Since none of the tries had this arrangement
of sheets, go back to the TryAllAlign, but also include the good
robetta models EXCEPT for model 5 (i.e. 3,4,7,9,10).

Thu Aug 19 16:00:41 PDT 2004 Sol Katzman

Try6 blew apart the strands and the s3,s4,s5 sheets. I think I will
abandon that approach. Instead, go back to the try5 constraints and
attempt to get s1 || s2 and s3 ^v s4 to work better.
Get the s3 ^v s4 constraint from FUGMOD_TS5.sheets.
Increase weight of constraints and hbond_geom_beta* as in try6.
Also, increase wet and dry weights per the more recent target defaults.
That will be try7.

Fri Aug 20 09:15:22 PDT 2004 Sol Katzman

Try7 was not very good. The s3 ^v s4 connections bonded only one pair
of residues. The rest of the s3 strand is out of position. These
attempts really do not seem to be going anywhere. I will see if just
reading in all models and increasing dry weights will help.

Aha! What a dope! I left out the 'infileprefix decoys/' before
attempting to read the existing models in
try4,try4h,try5,try5h,try6,try7. So all of these tries were starting
with just random conformations. No wonder nothing converged.

For try8, use the try7 costfcn, but this time actually read in the
desired robetta and FUGMOD models, as well as try3,try4,try5.

Also do a try6h (all-helix) that uses the try5h costfcn but actually
reads in the desired FUGMOD and PROTINFO models as well as try4h,try5h.

Fri Aug 20 15:05:56 PDT 2004 Kevin Karplus

The Holliday junction helicase is in a scop class that contains an
extended AAA-ATPase domain followed by a winged helix DNA-binding
domain. Virus proteins that bind and manipulate DNA and RNA are not
surprising---the homology is not close enough to determine exactly what
the protein does with the nucleic acid.

try8-opt2 makes an OK parallel sheet, but the helices do not pack well
against the sheet and the whole sheet is a bit questionable, since it
is floating in water with its hydrophobics hanging out.

try6h-opt2 is a little more compact, but exposes a lot of hydrophobics
and does not form any of the predicted strands.

I set up an unconstrained.costfcn, but the top 3 scorers are all
incomplete models. It is rather dangerous to include incomplete models
in optimization runs---that used to crash undertaker.
I think that the incomplete models are now being ignored, so including
the FUGMOD and PROTINFO models in an optimization run really has no
effect.

The PROTINFO-AB_TS1 model does suggest s3 ^v s4 and s2 ^v s4, though s2
may line up later on the s4 strand.

The best scoring full models with the unconstrained costfcn are
    try6h-opt2
    try5h-opt2
    try2-opt2
    try1-opt2
    robetta-model5

robetta-model5 looks pretty good, especially if we swung s1 ^v s2
around, and made s1 ^v s5. (One could also imagine a sheet topology
with s3 ^v s4 ^v s2 ^v s5 ^v s1---in fact, that looks very appealing
and would not require much change to the robetta-model5 structure
(separating s4 from s5, and moving s2 and s1 to the other side of the
helices).)

Fri Aug 20 18:56:29 PDT 2004 Sol Katzman

Attempt to implement Kevin's last suggestion:
    s3 ^v s4 ^v s2 ^v s5 ^v s1

    s3  70 71 72 73 74 75
        | | |  hbond V71
    s4  85 84 83 82 81 80 79 78
        | |  hbond T21
    s2  20 21 22 23 24
        | | |  hbond T20
    s5  114 113 112 111 110
        | |  hbond V4
    s1  4 5 6 7

This will be try9.

An alternative is to swap s1 and s2 to get:
    s3 ^v s4 ^v s1 ^v s5 ^v s2
This will be try10.

Sat Aug 21 08:21:46 PDT 2004 Kevin Karplus

It is not clear to me how the starting models were chosen for try9 and
try10. At least robetta-model5 was in both.

Neither try9 nor try10 scores well in unconstrained costfcn, though
each scores best in its own costfcn, and second best in the other.

I don't particularly like the s3 ^v s4 pairing used in
try9.costfcn---let's use the one from robetta model 5 instead:
    SheetConstraint R70 N76 M85 D79 hbond M75    # s3 ^v s4

How about
    s3> nRVEVKMne
    s4< IMIILAEdg
    s2> tTITId
    s5< wVEYFSIk
    s1> mgKVFLTNa

    SheetConstraint R70 N76 M85 D79 hbond M75      # s3 ^v s4
    SheetConstraint E80 I84 D25 T21 hbond A81      # s4 ^v s2
    SheetConstraint T20 D25 E114 K109 hbond T23    # s2 ^v s5
    SheetConstraint E114 K109 K3 N8 hbond I110     # s5 ^v s1

I'll put these in try11, and do a run starting from the alignments.
I'll also use them for try12 and do a run starting from existing
models. Before doing them though, I'll add some more templates to
MANUAL_TOP_HITS and make extra_alignments and all-align.*

Sat Aug 21 12:41:16 PDT 2004 Kevin Karplus

try12 almost forms the desired sheet, but strand s2 is rather messed
up---the burial on try12 looks pretty good.

try11 is also pretty good---almost there on the sheet (both strands 2
and 5 are a bit messed up).

It might be worthwhile to try doing an optimization from models again,
hoping for some crossover between try11 and try12.

If Sol is around to work on the model, he might want to try a different
approach with different strand pairings, and see what else we can come
up with.

Sat Aug 21 16:01:14 PDT 2004 Kevin Karplus

try13 is a polishing of try11, mainly reducing clashes. If I want to
polish try12, I should probably increase the constraint weight to 20 or
30, to pull try12 to the front of the list.

I made a tweaked costfcn for run try14, to try to improve on try12.

Sat Aug 21 20:03:58 PDT 2004 Kevin Karplus

Try14 does seem to improve on try12, though strand s2 is messed up.
Maybe I should try running s2 parallel to s5 instead of antiparallel.

Wait a sec! The cost function in try14 doesn't match the strands that
are forming. Let's try with sheet constraints that match what is
forming, and add s5 ^v s2 on the end. Maybe that will form a better
sheet. I've set up these constraints as try15, and will do another run.

Sun Aug 22 07:13:51 PDT 2004 Kevin Karplus

try15-opt2 has some bad gaps. I'll somewhat arbitrarily try shifting
strand s1 relative to s4 and flipping s1 and s5 over for try16.

Sun Aug 22 09:23:02 PDT 2004 Kevin Karplus

try16-opt2 still comes closer to meeting the try15 constraints than the
try16 ones. For try17, I'll leave the antiparallel sheet alone (as in
try15) but try making s2 || s5 instead of antiparallel.
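
Aside on the constraint notation: the SheetConstraint lines in these
notes give two residues per strand, and the || and ^v annotations next
to them agree with a simple reading: both strands numbered in the same
direction is parallel, opposite directions is antiparallel. The Python
sketch below applies that reading to check a constraint line against
the pairing it is meant to express. It is only an illustration and is
not part of undertaker; the field layout and the helper names
(residue_number, parse_sheet_constraint) are assumptions inferred from
the lines quoted in this README.

```python
import re

# One-letter amino acid code followed by a residue number, e.g. "R70".
RESIDUE = re.compile(r"^([A-Z])(\d+)$")


def residue_number(token):
    """Return the integer position from a token such as 'R70'."""
    match = RESIDUE.match(token)
    if match is None:
        raise ValueError(f"not a residue token: {token!r}")
    return int(match.group(2))


def parse_sheet_constraint(line):
    """Parse 'SheetConstraint A1 A2 B1 B2 hbond H' (assumed layout).

    Anything after '#' is treated as a comment and ignored.
    """
    fields = line.split("#", 1)[0].split()
    if len(fields) != 7 or fields[0] != "SheetConstraint" or fields[5] != "hbond":
        raise ValueError(f"unexpected SheetConstraint layout: {line!r}")
    a1, a2, b1, b2 = (residue_number(tok) for tok in fields[1:5])
    if a1 == a2 or b1 == b2:
        raise ValueError("each strand needs two distinct residues")
    # Same sign of (end - start) on both strands means same direction.
    same_direction = (a2 - a1) * (b2 - b1) > 0
    return {
        "strand_a": (a1, a2),
        "strand_b": (b1, b2),
        "hbond": residue_number(fields[6]),
        "orientation": "parallel" if same_direction else "antiparallel",
    }


if __name__ == "__main__":
    # The four constraints proposed for try11, copied from the notes.
    try11 = [
        "SheetConstraint R70 N76 M85 D79 hbond M75 # s3 ^v s4",
        "SheetConstraint E80 I84 D25 T21 hbond A81 # s4 ^v s2",
        "SheetConstraint T20 D25 E114 K109 hbond T23 # s2 ^v s5",
        "SheetConstraint E114 K109 K3 N8 hbond I110 # s5 ^v s1",
    ]
    for line in try11:
        parsed = parse_sheet_constraint(line)
        print(parsed["strand_a"], parsed["strand_b"], parsed["orientation"])
```

A check like this would have flagged the s1 ^v s2 versus s1 || s2
mix-up noted on Aug 11 before the run, provided the intended pairing
was written down next to each constraint.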

Sun Aug 22 10:33:19 PDT 2004 Kevin Karplus

For try18, I'll try a mix of try16 and try17 costfunctions, with the
slipped strands but with s2 || s5. I'll start it from alignments,
rather than from existing models, since I don't seem to be able to
budge the existing models much.

For try19, I'll try optimizing the try16 costfcn from alignments.

Sun Aug 22 11:34:10 PDT 2004 Kevin Karplus

try17-opt2 looks ok---not much change.
try18-opt1 looks intriguing---I'm interested in seeing what try18-opt2
looks like.
try19-opt1 looks like a failure.

Sun Aug 22 16:27:34 PDT 2004 Kevin Karplus

try19-opt2 looks terrible---there are some strand pairings but
everything is scattered.

try18-opt2 has a nice pairing for s3 ^v s4 and the helix before s3.
s1 bumps into s5 in an ugly way, but could be converted to
s1 ^v s5 || s2 fairly easily. This 3-strand sheet would not connect
well with s3 ^v s4.

In try17-opt2, s2 is still not lining up with s5. I think the try17
series has gotten as good as it is going to get by polishing.

From learithe@soe.ucsc.edu Sun Aug 22 12:13:16 2004
Date: Sun, 22 Aug 2004 12:13:12 -0700 (PDT)
From: Jenny Draper

Regarding T0242, try17, it looks like s2 and s4 are close to pairing.
should s4 slip down one, bringing V4 and I86 closer?

-Jenny

From learithe@soe.ucsc.edu Sun Aug 22 12:17:38 2004
Date: Sun, 22 Aug 2004 12:17:34 -0700 (PDT)
From: Jenny Draper

I meant _s1_ and s4.

============================================================
Sun Aug 22 16:38:54 PDT 2004 Kevin Karplus

It might be possible to align V4 with I86 instead of I84. try16
attempted that, but also had hbond changes that may have interfered
with cleaning up try17.

Currently, try17-opt2 scores best with an unconstrained costfcn,
followed by try13-opt2, try16-opt2, try6h-opt2. The try19 costfcn also
likes try17-opt2 best, as does the try18 costfcn.

I suppose I should try just sliding the try17 constraints over by 2 on
the s1 ^v s4 interface.
Hmm, but at the other end of the strand, we currently have T7 aligned
to E80---the phase is not consistent if we have V4-I86 and T7-E80 or
T7-L82. I'll try the T7-L82 pairing.

Sun Aug 22 19:31:12 PDT 2004 Kevin Karplus

Well, I'm running out of time so I need to submit some stuff.
    try20-opt2              best unconstrained
    try18-opt2              alternative sheet possibility
    try6h-opt2              alternative mostly helical possibility
                            that scores ok
    try1-opt2               full auto
    try1-opt2.repack-nonPC  best rosetta energy

There is no point in showing templates for this target, as they are all
trash.

Mon Aug 23 11:21:19 PDT 2004 Kevin Karplus

Changing to
    try20-opt2      best unconstrained
    try18-opt2      alternative sheet possibility
    try6h-opt2      alternative mostly helical possibility that scores ok
    try8-opt2       5-strand sheet
    try1-opt2       full auto