Wed Aug 4 15:14:42 PDT 2004 T0262 DUE 27 Aug 2004

Fri Aug 6 14:08:46 PDT 2004 Kevin Karplus
Because of a bug I introduced to Make.main, I had to remake this
prediction. This looks like a fold-recognition model with 1top
(a.39.1.5) as the template. Problem: t2k and t04 hits do not agree:
t04 puts 1q0uA (c.37.1.19) as the top hit. We should probably use the
rr constraints (when they are generated) to help choose between models.
Could there be 2 domains (the protein is long enough)? I'll make extra
alignments for the top t04 hits as well as the top t2k hits.

Fri Aug 6 21:57:07 PDT 2004 Kevin Karplus
Unfortunately the rr constraints are VERY weak, so probably won't help.
The try1 model looks pretty bad. The helices form, and it is fairly
compact, but the strands aren't extended, burial is pretty much
ignored, and the protein is foamy. All the alignments in
T0262.t2k.undertaker-align.pdb are very short, so are not giving much
structure.

Thu Aug 12 11:49:17 PDT 2004 Sol Katzman
The try1 model is nearly all helix. But both t2k and t04 have lots of
strand predictions that largely agree. One severe disagreement is the
region W120-L127, where t2k.str2 has helix and t04.str2 has strand.
(Note that t04 bys,stride,alpha,dssp do have HelixConstraints
E119-L127.)

Looking at the t2k and t04 hits for the 100-40-40-str2+CB_burial_14_7
models, there are numerous SCOP domains that we can filter, assuming
that we believe the large number of anti-parallel strand predictions
from both t2k,t04:
    a.39.1.5  -- all helix, EF-hand (1top,1ncx etc.)
    c.37.1.19 -- parallel beta sandwich (1q0uA,1qdeA etc.)
    c.51.1.1  -- mostly parallel sheets (1h4vB[326-421],1adjA[326-421] etc.)
    c.94.1.1  -- mostly parallel sheets (1aljA etc.)
    d.104.1.1 -- large mixed sheets (1h4vB[2-325],1adjA[2-325] etc.)
It seems that something like d.104.1.1 is the template we want.
I will set up some strand constraints, and create a rasmol script
pointed to by 'strands' to define them, mostly from the t04.str2
prediction:
    s1  V92-L94
    s2  W99-R101
    s3  I122-L127   # only predicted by t04.str2, others have helix
    s4  R133-R137
    s5  E140-Y145
    s6  I149-P155
    s7  L164-H166
    s8  L187-L191
    s9  A210-V214
    s10 K225-R228
    s11 V233-V236
For the fairly obvious turns, I can make anti-parallel sheet
constraints:
    s1 ^v s2
    s3 ^v s4 ^v s5 ^v s6
    s10 ^v s11
These will be included in try2.

Since s3 is questionable, and since try1 made an s1 ^v s2 sheet, I will
in parallel with try2 make another run that eliminates s3 and uses the
try1 s1 ^v s2 constraint. That will be included in try3.

For both try2 and try3, I increased the weight for constraints
(10 -> 25).

Mon Aug 23 10:50:04 PDT 2004 Sol Katzman
Not getting much hbonding in the requested sheets, either in try2 or
try3. For try4, use the weights that are in vogue in the later targets
(sidechain lower, dry and wet higher), as well as increase
hbond_geom_beta*, and increase the individual weights for the
SheetConstraints.

Tue Aug 24 10:07:58 PDT 2004 Sol Katzman
At our group meeting a couple of things were suggested:
    1) break this target into several domains
    2) try to get the 4 strongly conserved histidines
       (H85,H147,H166,H190) to cluster.

I created a subdirectory 75-200 and ran the base make on it to get
75-200/try1. As for the whole model, the top hit is:
    a.39.1.5 -- all helix, EF-hand (1top,1ncx etc.)

Looking at 1top and 75-200/try1 there is something to see. 1top
consists of two domains, separated by a very long (7 turns) straight
helix. Each domain contains two separate EF Hand motifs. The 1top
structure binds 2 Ca ions in one domain, and a SO4 in the other domain.
I presume that this is an artifact of the crystallization, and that
each domain could bind 2 Ca ions. The 3.0 Angstrom neighbors of each Ca
ion are a bunch of acid residues. See Branden and Tooze, 2nd edition
figures 2.13 and 6.21.
Viewed from this perspective, each of the 4 conserved histidines in
T0262 could separately participate in one of 4 binding sites, so trying
to cluster them would definitely be erroneous.

Looking more closely at the structure of 75-200/try1, it actually
corresponds fairly well to 1top, with the 7-turn linking helix from
1top reduced to a very short segment. This was obscured in the full
T0262 tries by extra helices in the preceding (1-74) and following
(201-256) regions.

One problem with this theory is that the conserved H147 is just about
where I would like to split 75-200 into two subdomains. On the other
hand, the strongly conserved R133 could participate in ion binding in
the first such subdomain.

For 75-200 try2, use the try1 sheet constraints, and increase the cost
of breaks somewhat as there are some bad breaks in try1. Use the
t2k.dssp-ehl2 constraints, except for the strand constraints that do
not correspond to the try1 sheets.

Tue Aug 24 19:47:47 PDT 2004 Sol Katzman
Looked at 75-200/try2 with Kevin. Since the putative EF-hand binding
sites have a distinct dearth of acidic residues, it seems unlikely that
this is really the function of this protein, despite its being the
closest template family.

So for 75-200/try3 I am using constraints to keep the NE2 atoms of the
four conserved histidines (H85,H147,H166,H190) together.

I also created two other subdirectories for domains which will overlap
with 75-200, namely 1-85 and 190-256.

In 1-85 we find there are no good templates; the best t2k E-value is
1.6E+01. 1-85/try1 does form a small antiparallel sheet
    K62-V66 ^v L75-M71
so include that constraint in 1-85/try2.

In 190-256 there is also not much to go on; the best t2k E-value is
8.5E+00. 190-256/try1 does not look like much of anything. For
190-256/try2, I will include the s10 ^v s11 constraint from the whole
target.

Wed Aug 25 09:37:38 PDT 2004 Sol Katzman
75-200/try3 did group the histidines as desired, although it introduced
a number of breaks.
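Whether a model really brings the four conserved histidines together can be checked by measuring the NE2-NE2 distances directly. The following is only a minimal sketch, not part of undertaker or the group's tools: it assumes a standard fixed-column PDB file, takes the first NE2 record found for each residue, and the function names `ne2_coords` and `pairwise_ne2_distances` are made up for illustration.

```python
from itertools import combinations
from math import dist

# Conserved histidines named in the notes (H85, H147, H166, H190).
CONSERVED_HIS = (85, 147, 166, 190)


def ne2_coords(pdb_lines, residues=CONSERVED_HIS):
    """Map residue number -> (x, y, z) of the NE2 atom of each listed HIS.

    Assumes standard fixed-column PDB ATOM records; only the first NE2
    record seen for a residue is kept.
    """
    coords = {}
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        atom_name = line[12:16].strip()
        res_name = line[17:20].strip()
        res_num = int(line[22:26])
        if atom_name == "NE2" and res_name == "HIS" and res_num in residues:
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            coords.setdefault(res_num, xyz)
    return coords


def pairwise_ne2_distances(pdb_lines, residues=CONSERVED_HIS):
    """Return {(res_a, res_b): distance in Angstroms} for every pair found."""
    coords = ne2_coords(pdb_lines, residues)
    return {
        (a, b): dist(coords[a], coords[b])
        for a, b in combinations(sorted(coords), 2)
    }
```

Usage would be along the lines of `pairwise_ne2_distances(open("some-model.pdb"))`, where the file name is a placeholder. Small distances for all six pairs would indicate that the clustering constraints were satisfied.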
1-85/try2 does not look much better than try1. 190-256/try2 formed a
little bit of the s10 ^v s11 sheet that we were looking for, but is
still not a great model.

Created a chimera from 1-85/try2 + 75-200/try3 + 190-256/try2:
    merge.d1-try2.d2-try3.d3-try2.under
        printAllConformPDB \
            T0262.chimer.d1-try2.d2-try3.d3-try2.pdb \
            superpose \
            atom T78.CA atom L79.CA atom A80.CA atom G81.CA \
            atom T192.CA atom S193.CA atom S194.CA atom L195.CA

The superposition is good for 75-200 + 190-256. In particular one of
the conserved histidines H190 overlaps. But 1-85 and 75-200 do not
overlap, with H85 in two completely different places. Something wrong
with my undertaker command?

Wed Aug 25 12:45:49 PDT 2004 Kevin Karplus
I'm seeing almost perfect overlap for T78-R84, so I assume that Sol
means that the 1-85+75-200 overlap was fine, but the 75-200+190-256
failed. I'm not sure why that happened, but I suspect an undertaker bug
having to do with incomplete conformations. I'll look into it.

Wed Aug 25 13:02:28 PDT 2004 Kevin Karplus
I found the problem, but have not fixed it---it would require an
algorithm change to the method for figuring out superposition that will
require some thought. As a quick workaround, putting the common part as
the first conformation should fix the problem (the initial conformation
is taken from the first one, and ones that don't have any atoms in
common with it may end up locked into arbitrary positions).

The merge.d2-try3.d1-try2.d3-try2.under does the different order,
correctly creating
    T0262.chimer.d2-try3.d1-try2.d3-try2.pdb
which has some really bad clashes. It may be possible to reshape it by
optimizing (after cutting and pasting to make a single chain).
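The cut-and-paste step mentioned above can be sketched as follows. This is a hedged illustration, not the procedure actually used (which was done by hand in rasmol): it assumes the three domain models have already been superposed into a common frame, that the files use standard fixed-column PDB ATOM records, and that the cut points are 1-79 / 80-193 / 194-256 (the 80-193 range comes from the Thu Aug 26 note; the flanking ranges are inferred from it). The function name `splice_chain` is hypothetical.

```python
def splice_chain(pieces):
    """Build a single chain from already-superposed domain models.

    pieces: list of (pdb_lines, first_res, last_res) tuples, in chain order.
    Only ATOM records whose residue number lies in [first_res, last_res]
    are kept from each model; atom serial numbers are renumbered from 1.
    Coordinates are copied unchanged, so clashes between domains remain.
    """
    spliced = []
    for pdb_lines, first_res, last_res in pieces:
        for line in pdb_lines:
            if not line.startswith("ATOM"):
                continue
            res_num = int(line[22:26])  # residue number, PDB columns 23-26
            if first_res <= res_num <= last_res:
                serial = len(spliced) + 1
                spliced.append(line[:6] + "%5d" % serial + line[11:])
    return spliced


# Assumed cut points (see the lead-in); d1, d2, d3 would hold the lines of
# the superposed 1-85, 75-200 and 190-256 models respectively:
# chimera = splice_chain([(d1, 1, 79), (d2, 80, 193), (d3, 194, 256)])
```

Cutting in the middle of each overlap keeps the superposed anchor residues close to the junctions, but nothing in this sketch repairs chain breaks or clashes; that would still be left to the optimizer.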
Wed Aug 25 14:37:30 PDT 2004 Sol Katzman
To avoid confusion, I renamed the two (3-model) superposition files
created above to:
(and I also edited the merge.*.under to use these names)
    T0262.superpose.d2-try3.d1-try2.d3-try2.pdb
    T0262.superpose.d1-try2.d2-try3.d3-try2.pdb

After cutting out the overlap, and renumbering in rasmol, the actual
(single model) chimera is here:
    decoys/T0262.chimer.d2-try3.d1-try2.d3-try2.pdb

As Kevin pointed out, there are some bad clashes:
    1-85 and 75-200 do NOT clash
    1-85 and 190-256 clash a lot
    75-200 and 190-256 clash a lot
So the key is to move 190-256 if possible. I may pursue this with
DeepView.

Wed Aug 25 17:15:35 PDT 2004 Sol Katzman
We never used the clustered histidine constraints on the whole protein.
Turn down the previous (whole try4 etc.) weights on SheetConstraints
and add the HIS constraints for whole try5, starting with TryAllAlign.

For whole try6, use the same constraints as try5, but read in the 3
versions of the chimera that I modified in DeepView (dv1,dv2,dv3), as
well as the unmodified (highly clashing) original chimera. (I cannot
say that I am very fond of any of these):
    T0262.chimer.d2-try3.d1-try2.d3-try2.pdb
    T0262.chimer.dv1.d2-try3.d1-try2.d3-try2.pdb
    T0262.chimer.dv2.d2-try3.d1-try2.d3-try2.pdb
    T0262.chimer.dv3.d2-try3.d1-try2.d3-try2.pdb

Wed Aug 25 18:10:24 PDT 2004 Kevin Karplus
I picked up the templates from the three subdirectories and added them
to MANUAL_TOP_HITS, and am making extra_alignments and all-align. When
that is done, I'll start try7 (no SCWRL on all-align, since it will be
so huge).

The try7 score function has only the histidine packing (no strands or
helices, since those seem to be inconsistently predicted between the
whole protein and the subdomains). Of the current models, it likes
T0262.chimer.dv1.d2-try3.d1-try2.d3-try2.pdb best, which appears to
have been hand-crafted to have the histidines close.
Thu Aug 26 09:52:07 PDT 2004 Sol Katzman
Regarding chimer.dv1,dv2,dv3 -- they should all have the histidines
close because they only moved residues 192-256 (and tried not to
introduce a large break between 191 and 192). The chimer produced from
the superpose (from which the deepview models in turn were constructed)
did the cut and paste in the middle of the overlap sections, preserving
residues 80-193 intact from the 75-200/try3 model, thus including the
four histidines from that model.

The try7 costfcn likes try7 best, then try6,try5,chimer.dv1,dv2,dv3.
The unconstrained costfcn (nearly the same as try7 costfcn without the
histidines, and a little higher soft-clashes and break weights) also
likes try7 best, then try1,try6,try4. Rosetta likes the repacked models
in the order try6,try7,try4,try5.

Thu Aug 26 10:29:26 PDT 2004 Kevin Karplus
I'd like to start a try8 run, like the try7 run from alignments, but
with helix and strand constraints as well as the strong
histidine-clustering constraints. I made a rasmol script "hist" that
shows the conserved histidines, defining set "histcons" in the process.
The try8 costfcn likes try7 best then try6.

Thu Aug 26 13:59:01 PDT 2004 Kevin Karplus
After doing the try8 optimization from alignments, the try8 costfcn
orders: try7-opt2, try6-opt2, try8-opt2 (all fairly close in cost). I'm
not sure what to do next on this one---pick models and submit? Try more
runs?

Thu Aug 26 14:41:46 PDT 2004 Kevin Karplus
I'll superimpose the top 3 contenders and see what I think. (Rosetta
dislikes least T0262.try6-opt2.repack-nonPC, but it hates them all.)

I think I like try8-opt2 best of the bad lot, though it does not score
as well---at least it gets some beta sheet pieces. The question is---do
I try to polish it, or do I submit as is? The chance of significant
improvement is small.

The unconstrained costfcn likes best try7-opt2, try8-opt2, try1-opt2,
try6-opt2, try4-opt2.repack-nonPC.
try4 may have the best secondary-structure match, but does not cluster
the histidines.

I'll submit
    try8-opt2
    try7-opt2
    try6-opt2
    try4-opt2.repack-nonPC
    try1-opt2
There are no template alignments long enough to be worth submitting.

Thu Nov 18 23:46:19 PST 2004 Martina Koeva
Based on the smooth gdt scores:
    best sam-t04    15.3386 (also align2)
    best submit     14.8555 (model3)
    model1          10.7583
    auto            12.7065
    align           11.6800
    robetta best    21.6417 (also robetta model1)
    robetta1        21.6417