13 June 2002 Kevin Karplus
T0135 seems to be an alpha/beta protein, but we don't have a close enough homolog to get a good alignment. The try1 run has not kept the sheet together, and has wound one of the strands into a helix.

25 June 2002 Kevin Karplus
The try2 run, using the new alignments and fragments from the STR HMM, is not doing any better---none of the four strands is paired and the C-terminal strand is wound into a helix. There ARE some sheets in the initial alignments---perhaps things will go better once H-bond scoring is added. In the meantime, we could try guessing the hydrogen bonding from the alignments, and adding constraints.

11 July 2002 Kevin Karplus
I'm remaking fragments with the new version of fragfinder, in the hopes of getting the beta sheet to form by magic (unlikely without H-bond cost functions).

The STR prediction has 4 strands. The first is predicted to be antiparallel or mixed, the second to be antiparallel edge, the third to be parallel (center or edge), and the last to be mixed. We also have a long helix between strands 1 and 2, so they should probably be oriented the same way.

What topologies are consistent with this prediction? I don't see any. The best I could do are
    ^v^^ 2143    matches STR but has problem with helix between 1 and 2
    ^^v^ 3142    doesn't match for strand 4.

Strand 2 could also be a parallel edge (almost as good as antiparallel), which yields the following topology:
    ^^vv 2143

So perhaps we could try adding some constraints.

The try3 run gets a new best score (no constraints yet), but still does not form a beta sheet.

18 July 2002 Yael/Jenny
Working on a try4 run, using the ^^vv 2143 topology as a constraint.

18 July 2002 Yael/Jenny
Discovered bug in constraint definitions which caused improper orientation of strands (all antiparallel to each other!). Will run again as try5, with the proper definitions.
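The topology search in the 11 July entry can be checked mechanically. The following is a minimal sketch, not part of undertaker or the STR tools. It assumes the notation means: the digit string lists strand numbers in spatial order across the sheet, and the arrow string gives the direction of the strand at each spatial position. The names classify, consistent_topologies, STRICT, and RELAXED are made up for this illustration.

```python
from itertools import permutations, product


def classify(orientations, order):
    """Describe each strand of a sheet written as e.g. ("^^vv", "2143").

    order lists strand numbers in spatial order across the sheet;
    orientations gives the direction at each spatial position.
    Returns {strand_number: (position, pairing)}.
    """
    result = {}
    n = len(order)
    for i, strand in enumerate(order):
        neighbors = [j for j in (i - 1, i + 1) if 0 <= j < n]
        kinds = {
            "parallel" if orientations[j] == orientations[i] else "antiparallel"
            for j in neighbors
        }
        pairing = kinds.pop() if len(kinds) == 1 else "mixed"
        position = "edge" if len(neighbors) == 1 else "center"
        result[int(strand)] = (position, pairing)
    return result


# STR prediction from the 11 July entry, as allowed (position, pairing) sets.
STRICT = {
    1: {("edge", "antiparallel"), ("center", "antiparallel"), ("center", "mixed")},
    2: {("edge", "antiparallel")},
    3: {("edge", "parallel"), ("center", "parallel")},
    4: {("center", "mixed")},
}
# Alternative reading: strand 2 may also be a parallel edge.
RELAXED = dict(STRICT)
RELAXED[2] = STRICT[2] | {("edge", "parallel")}


def consistent_topologies(prediction, same_direction=(1, 2)):
    """List (orientations, order) pairs that satisfy the prediction.

    same_direction names two strands forced to point the same way (the helix
    between strands 1 and 2); pass None to drop that requirement.
    Mirror images and global up/down flips are reported only once.
    """
    hits = []
    for perm in permutations("1234"):
        if perm[0] > perm[-1]:  # skip the mirror image of an order already seen
            continue
        order = "".join(perm)
        for rest in product("^v", repeat=3):
            orientations = "^" + "".join(rest)  # fix the first strand as "up"
            if same_direction is not None:
                a, b = (orientations[order.index(str(s))] for s in same_direction)
                if a != b:
                    continue
            described = classify(orientations, order)
            if all(described[s] in prediction[s] for s in prediction):
                hits.append((orientations, order))
    return hits


print(consistent_topologies(STRICT))                       # strict STR + helix
print(consistent_topologies(RELAXED))                      # strand 2 may be parallel edge
print(consistent_topologies(STRICT, same_direction=None))  # ignore the helix
```

Under these assumptions the strict prediction plus the helix requirement leaves no topology, relaxing strand 2 leaves only ^^vv 2143, and dropping the helix requirement leaves only ^v^^ 2143, in agreement with the hand analysis above.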
20 July 2002 Kevin Karplus
Although try6-opt is the best-scoring, it looks terrible, with the sheet not forming and one strand coiling up into a helix. The try4 run looks much better, aside from one bad break.

The topology for try4 is
    ^v^v 4132
which is consistent with STR predictions for 1 and 2, but not 3 or 4. The strand 2-3 pairing did not have any Hbond constraints in try4, but S45-T70 and N47-E68 paired by themselves. The strand 1-4 pairing K10-F106, T12-D104, L14-V102 has register off by 1 from the constraints provided for try4.

Maybe we should try extending the pairing seen in try4 to make a better model with this topology. Putting together the constraints derived from the alignment seen in try4-opt, naturally try4-opt scores the best of any of the existing decoys. Let's do a try7 run to see if we can improve things (particularly the chain breaks). I'll also set up the script to save the template atoms for faster loading on future runs, and run scwrl on each iteration to try to keep the sidechain packing from interfering with the backbone folding.

The big question for us is whether this antiparallel topology is correct, or whether we should be playing more with the ^^vv topology. We could explore the ^^vv topology more by using CB constraints instead of Hbond constraints---they are less sensitive to getting the phase of the strands exactly right.

Perhaps we need to create new files with the constraints for the different topologies, and just use an include command in define-score.script. That will make it easier to switch between different score definitions.

20 July 2002 Kevin Karplus
Using the constraints in try7-4132^v^v.constraints, T0135.try7-scwrl.30.30.pdb is the best scorer (outperforming try7-opt-scwrl, which accidentally overwrote try7-opt).
In try7-opt, there is a pretty bad break between H65 and A66, R100 seems to have twisted around to the wrong side of the sheet, and the whole thing is not packed as tightly as I'd like---still it is more convincing than anything else I've seen for this target.

Perhaps we should do a run with just the 1-4 strand pairings, and see what comes up out of the alignments and fragments. Perhaps this would help us decide what the right sheet topology is. Using the try8-14^v.constraints, the best current decoy is still try7-scwrl.30.30, so it is likely that the try8 run won't find us anything really great. (Try7 started from the try4 conformation, so was mainly doing tweaking, while try8 is starting from scratch.)

I've also submitted the try7-opt conformation to VAST (ID: VS29918 Password: casp5t0135). This should give us an alignment to a real structure that might provide a better core alignment than the ones we've been working with. The best alignment is to 2bopA, and it looks quite good, even having 16.7% identity and 1.7 rmsd over 54 residues. Also good are alignments to 1a7gE and 1fj7A, all in SCOP class d.58 (d.58.7 or d.58.8).
    2bopA    d.58.8
    1qupA    d.58.17
    1qd1B    d.58.34
    1a7gE    d.58.8
    1fj7A    d.58.7
    1qm9A    d.58.7
    1bs0A    c.67.1
    1scjB    d.58.3
    1fe4A    d.58.17
    1b3tA    d.58.8
    1cc8A    d.58.17
    1kp6A    d.58.25
    1mla     d.58.23
    1h6kZ    d.58.7
    ...

I should grab some of the alignments out of the CN3D alignment editor and use them as alignments to try in undertaker.

Note: our 3rd highest hit (1ha1) is from d.58.7.1, so this is probably the source for that fold, though 1f9fA (d.58.8.1) is also one of the high hits, as is 1fj7A (d.58.7.1). Also d.58 (1louA) is the (weak) consensus of the CAFASP servers. Unless something else good emerges, we'll probably go with a d.58 prediction of some sort---if we're wrong, at least we'll be in good company.
It seems like even erroneous constraints (as in try4) drive the sheet formation faster than a smaller set of supposedly correct constraints---try8 is not getting nearly as many "good" structures as try4, when judged on the try8 scoring function. This suggests a strategy when trying to collapse beta-sheets---add some arbitrary "collapsing" functions (like keeping the centers of the strands near each other) and see what emerges, then try refining the constraints to get cleaner sheets.

21 July 2002 Kevin Karplus
try8 never got as good a score as try7, but it did beat try4. try8-opt-scwrl tries pairing D19-H65 and F17-F67, with no other sheet-forming hbonds. Note that these Hbonds are NOT ones that it was looking for, and are off by 2 from the ones try7 was looking for (T15-F67, F17-H65). try8 looks like a dead end.

I picked up a lot of the VAST alignments to try7-opt, and tried editing a few of them to lengthen the aligned regions---particularly in places where try7-opt had bad breaks. I selected alignments by VAST P-value, by number of aligned residues, and by %identity.

For try9, I'll keep the limited constraints of try8, and try inserting the VAST alignments (but not use try7-opt as a starting point, since the clash reduction and optimization done there makes it hard to add alignments or fragments).

Hmm---minor problem. The files passed through a Macintosh, since I was working at home, and so are not proper UNIX files. I had to use emacs to read the files (emacs understands MAC files), then copy the contents to another buffer (where emacs assumes UNIX format), then save the file.

Also, before running try9, I changed the BEST_EVALUE threshold in Makefile to allow in more alignments in the default scripts, remaking T0135.t2k.best-scores.rdb T0135.t2k-2track-undertaker.a2m and everything that depends on them.

Scores on the try9 run are looking much better than the try8 run---it is almost certain to do better than the try4 run, and may compete with try7.
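The "off by 2" comparison between the try8 pairings and the try7 constraints can be checked with a little arithmetic: along an antiparallel ladder the paired residue numbers i and j keep i+j constant, and along a parallel ladder they keep j-i constant. The sketch below uses only the residue pairs quoted in these notes; the function name register is made up for the illustration and is not an undertaker command.

```python
def register(pairs, antiparallel=True):
    """Return the register value shared by all residue pairs.

    For antiparallel pairing the invariant is i + j; for parallel it is j - i.
    Raises ValueError if the pairs do not all belong to the same register.
    """
    values = {(i + j) if antiparallel else (j - i) for i, j in pairs}
    if len(values) != 1:
        raise ValueError(f"pairs are not in one register: {sorted(values)}")
    return values.pop()


# Strand 1-3 pairs that try7 was looking for, and the ones try8 actually made.
try7_wanted = [(15, 67), (17, 65)]   # T15-F67, F17-H65
try8_found = [(19, 65), (17, 67)]    # D19-H65, F17-F67
shift = register(try8_found) - register(try7_wanted)
print(shift)  # 2

# Strand 1-4 pairs seen in try4: K10-F106, T12-D104, L14-V102.
print(register([(10, 106), (12, 104), (14, 102)]))  # 116
```

A nonzero difference between two register values is the number of residues one strand has slid relative to the other.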
Hmm, looking at a couple of the early iterations, it seems that strands 413 are joining nicely, and the predicted bend in the helix between 3 and 4 is modeled well, but strand 2 is way out in space. There are double H-bonds between T12-D104, L14-V102, R16-R100, T15-F67, L13-S69, H11-F71, which are compatible with the constraints from try7.

If try9 doesn't fix strand2 in the final pool, I'll try using the try7-4132^v^v.constraints, or creating a new set of constraints based on what I see in the best of the try9 structures.

21 July 2002 Kevin Karplus
try9-opt-scwrl scores very slightly worse than try7-opt, but does not have the strand 2 fixed. I guess I'll have to do another run, with the try7 constraints. With this scoring function, try9 does rather poorly (since the constraints for strand2 are not met). I'll seed the try10 run with a couple of the better-scoring try9 decoys, but not with the try7-opt decoy, since it is a local minimum in the scoring function and may keep the algorithm from exploring more of the structure space.

21 July 2002 Kevin Karplus
try10-opt-scwrl is not quite as good a score as try7-opt, but comes fairly close, which is a bit surprising as strand2 has still not attached to the sheet. Perhaps I should try another run, with both try7-opt and try10-opt-scwrl as initial conformations, seeing if some crossover action will produce a better model. Of course, what we really need is a double-crossover, with the child of AAA and BBB being ABA, since the 1st and 3rd strands are well placed, and only the 2nd one is badly placed. This operator would be a bit of a nuisance to implement.

21 July 2002 Kevin Karplus
try11-opt scores slightly better than try11-opt-scwrl---SCWRL reduces clashes, improves the rotamer probabilities slightly, and reduces the radius of gyration, but increases all the burial costs (except gen6.5).
I suspect that the scwrl result is slightly better---the extra burial value probably comes from sidechains bumping into each other keeping the helix too far from the sheet.

Try11 is clearly based mainly on try7---it does not have the desirable bent helix of try9, and it still has the bad break between H65 and A66. The structure is a little bit loose and "foamy"---I wonder how I can induce a tighter packing.

I'll use the sumperimpose.under script to superimpose try9-opt-scwrl, try10-opt-scwrl, and try11-opt-scwrl, then piece together a mix-and-match PDB file to use as a starting point. Unfortunately, the crufty old version of rasmol that runs on macs does not allow viewing multiple models, so choosing the pieces will have to wait until I can see them on my Linux box.

22 July 2002 Kevin Karplus
T0135.try9-10-11.super superimposes 3 decoys.
    model 1 is called T0135.try9-opt-scwrl.pdb
    model 2 is called T0135.try10-opt-scwrl.pdb
    model 3 is called T0135.try11-opt-scwrl.pdb

Note: these break reports use a 0-based numbering system---that should be changed to use the PDBNum numbers.
    T0135.try9-opt-scwrl.pdb has 2 breaks
    T0135.try9-opt-scwrl.pdb breaks before 67 with cost 0.0726077
    T0135.try9-opt-scwrl.pdb breaks before 97 with cost 0.0855783
    T0135.try10-opt-scwrl.pdb has 2 breaks
    T0135.try10-opt-scwrl.pdb breaks before 66 with cost 0.049496
    T0135.try10-opt-scwrl.pdb breaks before 98 with cost 0.0423624
    T0135.try11-opt-scwrl.pdb has 3 breaks
    T0135.try11-opt-scwrl.pdb breaks before 41 with cost 0.034909
    T0135.try11-opt-scwrl.pdb breaks before 65 with cost 0.0886167
    T0135.try11-opt-scwrl.pdb breaks before 100 with cost 0.0609589

For the helix or helices from K74 to somewhere in the low 90s, I like model 1 (try9-opt-scwrl) best, but I could change to model 2 easily at L97.

For strand2 (M43-H47), I like model 3 (try11-opt-scwrl) best, since it is the only one that really forms the Hbonds.
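The 0-based break reports above can be converted to residue numbers with a small script. This is a minimal sketch that assumes each report line has exactly the form shown in these notes and that the first residue of the model is numbered 1; parse_breaks and first_residue are names made up for the illustration.

```python
import re

# Matches lines such as:
#   T0135.try11-opt-scwrl.pdb breaks before 65 with cost 0.0886167
BREAK_LINE = re.compile(
    r"(?P<model>\S+) breaks before (?P<index>\d+) with cost (?P<cost>[\d.eE+-]+)"
)


def parse_breaks(report, first_residue=1):
    """Return (model, residue_before, residue_after, cost) for each break.

    The report index is 0-based, so index k means the break sits just before
    residue number k + first_residue (an assumption about the numbering).
    """
    breaks = []
    for match in BREAK_LINE.finditer(report):
        after = int(match["index"]) + first_residue
        breaks.append((match["model"], after - 1, after, float(match["cost"])))
    return breaks


report = """\
T0135.try11-opt-scwrl.pdb has 3 breaks
T0135.try11-opt-scwrl.pdb breaks before 41 with cost 0.034909
T0135.try11-opt-scwrl.pdb breaks before 65 with cost 0.0886167
T0135.try11-opt-scwrl.pdb breaks before 100 with cost 0.0609589
"""
for model, before, after, cost in parse_breaks(report):
    print(f"{model}: break between {before} and {after} (cost {cost})")
```

With this offset, the try11 entry "breaks before 65" becomes a break between residues 65 and 66, which matches the H65-A66 break described in the notes.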
I improved the superposition algorithm used for superimposing the models, and now have a much easier time doing cut-and-paste. Let's try
    Cut-and-paste
    strand1              -R16       model 1
    helix1+strand2       F17-G53    model 3
    minihelix+strand3    M54-E72    model 2
    helixcluster         S73-T96    model 1
    strand4              L97-*      model 1

22 July 2002 Kevin Karplus
try12-opt-scwrl, started from the cut-and-paste model, scores almost (but not quite) as well as try11-opt-scwrl. There are 5 breaks, including bad ones at 20-21, 66-67, and 97-98. I wish ReduceBreak were working better---then maybe we could seal up the gaps and have a good model.

I tried using the new pred_alpha2 cost function in place of alpha and alpha_prev---this makes the gap between try11-opt and try12-opt-scwrl somewhat larger. I'll try a new run starting from try12-opt-scwrl with the new cost function, and with priors for ReduceBreak and CloseGap increased. Already on the first iteration it gets a better score than try11-opt, the previous best.

23 July 2002 Kevin Karplus
try13-opt is the new best score (better than try13-opt-scwrl). There are still 5 breaks: 24-25, 41-42, 66-67, 72-73, 97-98. The worst of them is the 97-98, with 66-67 close behind. The spacing at 97-98 is not too bad, but the OG of S98 is making a bond to C L97---changing the psi angle of S98 would almost close the gap. The one at 66-67 is a large gap in mid-strand---one that I would have thought ReduceBreak could fix. The helices are packed rather loosely against the sheet, though it looks like they could nestle in closer with a few sidechain rearrangements.

I looked at the superposition of try13-opt, try12-opt-scwrl, try11-opt-scwrl, try10-opt-scwrl, and try9-opt-scwrl, to see if I could find a way to close the gap at 66-67. It looks like about the best I can do is to copy 64-66 from try10-opt-scwrl into try13-opt. This creates a new (smaller) gap, which I hope will get swept into the sheet by CloseGap.
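The cut-and-paste step is a manual multi-point crossover: each residue range is taken from one of the superimposed models. The sketch below shows only the bookkeeping, using the segment plan from the 22 July entry. It assumes the models are already superimposed and stores a placeholder string per residue instead of real atom records; assemble, PLAN, and the toy residue range 1-106 are illustrative and are not the actual undertaker procedure or the real chain length.

```python
# (first residue, last residue, source model); None means "open ended".
PLAN = [
    (None, 16, 1),   # strand1            -R16     model 1
    (17, 53, 3),     # helix1+strand2     F17-G53  model 3
    (54, 72, 2),     # minihelix+strand3  M54-E72  model 2
    (73, 96, 1),     # helixcluster       S73-T96  model 1
    (97, None, 1),   # strand4            L97-*    model 1
]


def assemble(models, plan):
    """Build a chimera {residue_number: record} from superimposed models.

    models maps a model number to {residue_number: record}; each residue is
    copied from the model named by the first plan segment that covers it.
    """
    residues = sorted(set().union(*(m.keys() for m in models.values())))
    chimera = {}
    for res in residues:
        for start, end, source in plan:
            if (start is None or res >= start) and (end is None or res <= end):
                chimera[res] = models[source][res]
                break
        else:
            raise ValueError(f"residue {res} is not covered by the plan")
    return chimera


# Toy data: three "models" whose records just say where they came from.
toy_models = {k: {r: f"model{k}:res{r}" for r in range(1, 107)} for k in (1, 2, 3)}
chimera = assemble(toy_models, PLAN)
print(chimera[16], chimera[17], chimera[60], chimera[97])
```

Each boundary between segments taken from different models is a potential chain break, since superposition does not guarantee that the backbone joins cleanly there.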
I created this chimera in T0135-cut-and-paste-2.pdb. It doesn't score as well as try13-opt (or many of the other try13 runs), mainly because H65 is still turned the wrong way, and sliding it down doesn't flip it over.

Just noticed that I had the score function defined wrong---forgot the coefficient after pred_alpha2, so the weights for it and contact order were screwed up. The same observations apply after fixing the score function.

23 July 2002 Kevin Karplus
The best-scoring decoy is now try14-opt. There are still 5 breaks, and the one at 98-99 is particularly large. I got a couple more alignments from VAST to one of the early iterations of try14 (cs29973, password casp5t0135), and so for try15 I'll start with just alignments, not seeding with a conformation. I don't expect this to do phenomenally well, but I hope to be able to do a cut-and-paste between try14-opt and try15-opt.

24 July 2002 Kevin Karplus
Try15-opt-scwrl does not score as well as try14-opt, but has only 3 breaks, two of them (around 65 and 96) smaller than the corresponding breaks in try14-opt. I should copy H65 and A66 from try15-opt-scwrl to try14-opt, but the break around 96 looks better in try14-opt, despite the worse score (try15 had messed up the helix to reduce the gap). I should also copy Y29-M43, to close the gap at 41.

In try16 iterations, the H65 and A66 keep getting attached to T64 instead of F67, even though the break at A66-F67 is much smaller. Perhaps this has to do with the way AlignedFragments::merge_all_short_segments works, which tries to merge with the longer neighbor first---it may be better to merge with the closer segment first. I killed try16 and will try again (try17) with merge_all_short_segments changed.

Hmm, this isn't helping, since the cost of the break at 64 is higher than the cost at 66, so undertaker still favors joining in the way I like less. How can I force the right join?
I'll let try17 run while I think about it, since on the first iteration it came up with a new best score.

Hmm---I'll try choosing randomly which segment to merge with, favoring the closer one. I'll run this as try18, but still let try17 finish, since it may do better. The first 4 iterations of try18 all merged the short segment the way I DON'T like---I'll have to see if any of the later ones managed to sample the way I DO like.

24 July 2002 Kevin Karplus
The new best scorer is try18-opt (not try18-opt-scwrl), but it still has the bad break after A66. I can also see how the helices and sheet should inter-digitate, but I don't know how to get the packing to happen.

25 July 2002 Kevin Karplus
Maybe the new JiggleSubtree operator could help improve the packing? Yes, it seems to---the best new score is T0135.try19.2.40 (undertaker crashed, perhaps because I had recompiled it while running, so there was no try19-opt). Let's try another run, with OptSubtree as well as JiggleSubtree.

try20-opt is a new best score, beating even try20-opt-scwrl. The scwrl run has a better score for most measures, but slightly worse on contact order. Perhaps I should do one more run, reducing the weight of contact order, and using the new OptSegment and OptClash operators to try to pack the helices tighter against the sheet.

Maybe I should temporarily add a constraint to try to get the helices to pack better---maybe CD1 of L81 against C of L101 and C of V102 and CD1 of L94 against CB Q99 and CG2 T15. Adding the constraints still leaves try21-opt as the best scorer.
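Returning to the try18 idea of choosing the merge partner at random while favoring the closer segment: one simple way to do that is to weight each side by the inverse of its gap size. The sketch below is only an illustration of that idea in Python. It is not the code in AlignedFragments::merge_all_short_segments, and the inverse-distance weighting, the function name choose_merge_side, and the gap sizes in the example are all assumptions.

```python
import random


def choose_merge_side(gap_left, gap_right, rng=random):
    """Pick "left" or "right" at random, favoring the side with the smaller gap.

    Each side is weighted by 1 / gap, so a gap one third the size is chosen
    three times as often. A small epsilon avoids division by zero.
    """
    eps = 1e-6
    weights = [1.0 / (gap_left + eps), 1.0 / (gap_right + eps)]
    return rng.choices(["left", "right"], weights=weights)[0]


# Hypothetical gap sizes: the right-hand neighbor is three times closer.
rng = random.Random(0)
picks = [choose_merge_side(3.0, 1.0, rng) for _ in range(1000)]
print(picks.count("right") / len(picks))  # roughly 0.75
```

With weighting this mild, several consecutive iterations can still pick the less favored side, which is consistent with the first few try18 iterations all merging the unwanted way; a steeper weighting would sample the closer segment more reliably.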
26 July 2002 Kevin Karplus
I tried twiddling the packing constraints a bit, using
    Constraint 750 784 2 3.3 5 // CD1 L94 CB Q99
    Constraint 750 115 2 3.2 5 // CD1 L94 CG2 T15
    Constraint 806 691 2 3.2 5 // CD1 L101 CG L86
    Constraint 806 750 2 3.2 5 // CD1 L101 CD1 L94
    Constraint 100 651 2 3.2 5 // CD1 L13 CE2 Y80
    Constraint 100 659 2 3.2 5 // CD1 L13 CG L81

Francisco Useche rebooted the machine that try22 was running on, without telling me, so I'll have to start over on a different machine.

26 July 2002 Kevin Karplus
The try23-opt conformation looks pretty good, but is still a little looser than I'd like. Perhaps we can try to get some contact between the aromatics F17 (CE1) and F93 (CZ), F46 (CE1) and Y33 (CE2), F71 (CE2) and Y80 (CD2).

Adding these constraints still leaves try23-opt on top, so let's try another run, with the score parameters adjusted to try to make packing more important. Now try23.17.40 and try23.16.40 beat try23-opt, so let's add them as starting points.

Hmm, try24 seems to be spending all its time doing CloseGap. Maybe I made the prior probability of that too high.

The new best scorer is try24-opt, and it does look denser, but still not as dense as I'd like. Increasing the weight for the constraints and decreasing slightly the weight for breaks and clashes should favor denser packing (at least near where I imposed constraints). Doing so makes try23-opt score best again. I'll try tweaking the weights a bit, and I'll turn off a couple of constraints near the ends of the pair of helices I want to pack in tighter, to allow a bit more flexibility (both K18-S98 bonds and H11 N-O F71). Let's start with both try23-opt and try24-opt and see what happens.

27 July 2002 Kevin Karplus
try25-opt is new best score. If we turn off the "extra-packing.constraints", then the best is try25-try23.15.40, with try25-opt as second best.
After re-optimizing (without packing constraints) new best is try26-opt-scwrl (but after reading the pdb files back in, the cost increases and try26-opt scores better). I'll submit try26-opt-scwrl, and replace it if I can come up with a better one. There is still a little looseness in the helix packing, and K74-Q78 has unwound a bit. Let's try pasting in a helix (say from try13-opt) for 74-81. The helix gets unwound in a very similar way in try27-opt, but try27-opt does score somewhat better than try26-opt, so I should resubmit.