Fri Jul 5 03:27:58 PDT 2002

5 July 2002 Kevin Karplus
    The top hits for T0146 are all over the place, and none of them seem particularly strong. The CAFASP servers are also all over the map.

10 July 2002 Kevin Karplus
    The T0146.t2k.undertaker-align.pdb.gz has some alignments that are not superimposed, apparently because model 1 (1a7cA) has no residues after 86, and some of the models (such as 66 1dxrH) have no residues before 190.
    All the sequences in T0146.t2k.a2m.gz are full length (or nearly), but the sequence is long enough to be a multiple domain protein.

11 July 2002 Kevin Karplus
    In T0146.try1-opt.pdb we have a nice antiparallel beta sheet from H56 to I84, but the rest of the beta sheets are blown up, with some of the strands coiled into helices. I'm not sure what to do to get a better structure (though the predicted alpha and H-bond scoring functions should help when they are done).

12 Aug 2002 Kevin Karplus
    Reran make with new template library. No change in top hits.
    Modernized define-score.script and undertaker.script, and reran undertaker.

Fri Aug 16 15:11:56 PDT 2002 Kevin Karplus
    Best score is try2-opt-scwrl, but it looks poor---beta sheets have been exploded and there is a strand which was predicted to be helical.
    Robetta broke it into two domains (between P156 and A157) and got something for each domain. That particular break looks bad---it is in the middle of a predicted hairpin.
    Perhaps we should try doing 3 subdomains to generate alignments:
        1-180
        120-220
        180-325
    We could also toss the robetta models into the optimization---they may score much better once the two halves are pasted back together.
    I'll start the optimization with robetta models tossed in while the alignments are being generated.

Fri Aug 16 15:56:20 PDT 2002
    (oops, I accidentally made a 1-80 subdomain instead of 1-180. Oh, well, might as well include both.)
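The overlapping subdomain ranges listed above (1-180, 120-220, 180-325, plus the accidental 1-80) are 1-based and inclusive, so cutting them from the full-length sequence needs an off-by-one conversion. The following is only a sketch of that step, not the script actually used: the function name and the all-'A' placeholder sequence are assumptions for illustration, and the "t0146-1-180" naming simply mirrors how the subdomain run is referred to in these notes.

```python
def cut_subdomains(sequence, ranges, target="t0146"):
    """Return {name: subsequence} for 1-based, inclusive residue ranges."""
    pieces = {}
    for start, end in ranges:
        if not (1 <= start <= end <= len(sequence)):
            raise ValueError(f"range {start}-{end} outside 1-{len(sequence)}")
        # Convert 1-based inclusive numbering to a Python slice.
        pieces[f"{target}-{start}-{end}"] = sequence[start - 1:end]
    return pieces


# Placeholder sequence (NOT the real T0146 sequence): 325 residues of 'A'.
placeholder = "A" * 325
ranges = [(1, 180), (120, 220), (180, 325), (1, 80)]
subdomains = cut_subdomains(placeholder, ranges)
for name, seq in subdomains.items():
    print(name, len(seq))
```

Keeping the original residue numbers in the piece names makes it easier to map any alignment found for a piece back onto full-length numbering.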
Fri Aug 16 16:22:21 PDT 2002
    The domain boundary for the first domain seems to be around 130-140, based on the alignments found by target2k for t0146-1-180. There is a turn predicted for P137-E139, so that might be a good domain boundary.

17 Aug 2002 Kevin Karplus
    The try3 run (starting from Robetta models) gets the best score (try3-opt), which seems to be based on a robetta2 model. It has a lot of hairpins, which agree fairly well with predicted structure.
    In try4-opt-scwrl, based on subdomains, the structure for 228-325 looks good, though there are some disagreements with secondary structure prediction. The strands at the beginning (before about G66) are separated from the initial strand---probably getting too high a weight for contact-order.
    If we turn off contact-order in the score function and rescore, the best is try3-opt.
    Let's do another run starting from try4-opt-scwrl and try3-opt, but try doing alignment insertion into try3-opt.

17 Aug 2002 Kevin Karplus
    The new best is try5-opt. We still have not recovered the sheets that were in robetta2.pdb.
    Perhaps we should do another run from the robetta models, with OptSubTree turned up very high to paste the half-models back together. Alternatively, we could try to guess some Hbonds (or CB constraints) from the robetta models and put them into the score function.
    I'll try the high-OptSubtree from just the rosetta models first.

18 Aug 2002 Kevin Karplus
    Best is still try5-opt, but try6-opt-scwrl has some good-looking substructures---possibly better than try5.
    Perhaps we should grab some constraints from try5 or try6, and apply them to try4 (which doesn't have any rosetta models in it)???
    In the meantime, let's start a run starting from try6-opt-scwrl, to see if it can be repacked to better than try5.

18 Aug 2002 Kevin Karplus
    try7-opt-scwrl is new best. It looks like the hairpin T107-L117 has been pulled out of a barrel.
    I'll need help guessing some more constraints for this one!
19 Aug 2002 Jenny Draper
    Attempting constraints for putting T107 side of hairpin into a sheet/barrel by adding a T107-W83 strand pairing.

20 Aug 2002 Jenny Draper
    try8-opt-scwrl is new best-scoring, but still looks like a mess; lots of strands are still wound up as helices, and my constraints were ignored. Trying again...

21 Aug 2002 Jenny Draper
    try9-opt-scwrl new best-scoring, and seems to be pulling more sheets together. It's still a tangled mess. Try10 is attempting to optimize try9 a little, and then I'll look for more constraints.

22 Aug 2002 Jenny Draper
    same as yesterday. try11 now trying to improve on try10...

24 Aug 2002 Jenny Draper
    same story, again. I think I'll need to try one of the perl scripts to move a piece out of the way in order to get the sheet to form...

26 Aug 2002 Jenny Draper
    Running try12 off of T0146-moved.pdb, hoping to get more sheet forming.
    Currently, I think the best models are try11, try5, and try4. I really like robetta2 & robetta3; all the alignments are to structures with one large sheet... but I just can't seem to get one to form?!

26 Aug 2002 Jenny Draper
    Attempting a new run off of "TO146-robetta-moved.pdb", which combines robetta3 domain 2 with robetta2 domain 1...

27 Aug 2002 Jenny Draper 9:30 am
    Try13 scores badly (it still has some bad breaks), and it has blown up the sheet, but the overall structure looks more believable (and more like the single-sheet structures of the top hits). I'm going to try again straight from robetta-moved, but this time with the sheet constraints added...

27 Aug 2002 Jenny Draper 3pm
    Try14 scores best with the new constraints from the robetta models. It has a really nice sheet for strands 1-6 from robetta2's domain 1; but it's blown apart the sheet for domain 2.
    Try15 will try forcing the central predicted strands (helical in the robetta model, and in most of our tries) into the sheet, hopefully providing a better connection between the two "domains".
    Update on the "current best models": try14, try11, try5/try4.
    Try11 is the "most different", since try5/4 are based, like try14, on the robetta structures; however, I don't trust try11. To submit try11, I'd do a run or two on try12, trying to close the breaks in it -- but I'm moving in the try14/robetta direction.

27 Aug 2002 Jenny Draper 9pm
    Try 15 blew apart; running try16 off of try15 & robetta-moved, including all the read-alignments; hoping to see things pulling back together.
    I'm not certain about the constraints I have for the center sheet; right now I have 3 constraint files I'm using: try14.constraints (taken from the robetta models), try15-certain.constraints (trying to hold strands out; forming strand 7-8 hairpin), and try15-guess.constraints, which are my best current guess as to how to connect the two robetta sheets.

28 Aug 2002 Jenny Draper
    Try 16 is still killing robetta's "domain 2". It looks like it's starting to straighten out the sheets in the middle, though. Running try17 off of it...

28 Aug 2002 05:23 Kevin Karplus
    try17 still running. Current best is try17-try16-moved.5.40, which has not put the sheet back together yet. Up to about residue 100 it looks ok, but things fall apart after that.
    Maybe the strand around K105-L116 should be parallel to F81-R87, with A147-L151 parallel to that. The helices probably don't run from one strand immediately to the next, but figuring out the topology will be hard.

08:45
    try17-opt is new best (opt-scwrl is identical---scwrl failed).
    Constraints needed to pack sheet:
        force Q288-N296 to be helical
        make I304-R308 antiparallel to D263-K267
        straightness constraint for R245-S253
        straightness constraint for V106-P110
    Added these as new try18.constraints, borrowing constraints from try14 and try15-certain, fixing some typos on the way.

28 Aug 2002 Jenny Draper
    The try19 run finished; its constraints were a mess, w/ conflicts everywhere; had pairings of strands 10-11, 10-14, 11-14, 14-15...
    but they helped me with my incorrect drawings of strand 10!
    Trying try20, with a new all antiparallel strand-turn-strand sheet structure for strands 10-15 (A246-E317, robetta domain 2), and then adding strand 9 (does it even exist?) onto strand 10, parallel.

    my strand-numbering system, as of try20:
         1: T19-M21      6: K105-D111    11: E262-M268
         2: A27-I31      7: A147-W152    12: G276-K281
         3: H56-H61      8: F160-D165    13: V291-N296
         4: S70-D77      9: A215-F223    14: I304-R308
         5: F81-E85     10: A246-A252    15: T313-E317

Date: Wed, 28 Aug 2002 15:51:36 -0700
From: Kevin Karplus
To: karplus@soe.ucsc.edu, rachelk@soe.ucsc.edu, weber@soe.ucsc.edu, learithe@cats.ucsc.edu, yael@biology.ucsc.edu, baertsch@soe.ucsc.edu, rph@soe.ucsc.edu, afyfe@soe.ucsc.edu, jcasper@soe.ucsc.edu, oscarhur@soe.ucsc.edu
Subject: t0146

In T0146.try18.3.80.pdb we look like we're getting the beginnings of a beta sandwich. Jenny---do you see anything we can do to encourage it? Is it compatible with your try20 constraints?

Date: Wed, 28 Aug 2002 15:55:32 -0700 (PDT)
From: Jenny Draper
To: Kevin Karplus

I was just looking into that. Note that I think it is third from the top (the top 2 are from try3) in score-decoys at the moment, which was scored using the try20 constraints in the define-score.script; so it's probably good....
-Jenny

From: Jenny Draper
To: Kevin Karplus
Subject: Re: t0146

Hmm. T0146.try20-try17.0.80.pdb, new best scoring and first of the try20 decoys, appears to blow things up again... I think I'm going to kill try20 & start from T0146.try18.3.80.pdb, which wasn't in the initial pool for try20...
-Jenny

Date: Wed, 28 Aug 2002 16:03:24 -0700 (PDT)
From: Jenny Draper
To: Kevin Karplus
Subject: Re: t0146

Actually, on second thought, I'll just make a new run, try21.
-jenny

28 Aug 2002 Jenny Draper 4pm
    Trying try21, which starts from T0146.try18.3.80.pdb; hopefully, it will start forming the beta sandwich that seems to be starting in try18.3

28 Aug 2002 Jenny Draper 5:45pm
    Trying try22:
    //try22.constraints
    //trying a new sheet structure, based on try18.3.80
    // 10 11 12 helix 14 15
    // /\ /\ \/ \/ /\
    // where 10-11 (G252-R244 - E262-M268),
    // 11-12 (E262-M268 - R273-K281),
    // and 14-15 (I304-R308 - T313-E317) are all hairpins

28 Aug 2002 Jenny Draper 5:50pm
    The current story:
    try20 is running; it contains the all-antiparallel sheet for residues 216-317 discussed in the meeting. It seems to be forming something new, which could be good or could be bad.
    try21 is running using the same constraints as try20, only starting from try18.3.80, to see if this will get the try20 constraints to form a nice beta sandwich.
    try22 is running off of try18.3.80, using an entirely new set of constraints (try22.constraints), which seemed to fit the structure in try18.3.80.
    NOTE: I discovered a conflict in the restraints for tries 20 & 21: I had simultaneously added constraints to make strand 13 (V291-M295) both helical, and a strand in the sheet. Oops. Note that this section is uncertain in the alignments as to whether it is a strand or a helix...

28 Aug 2002 Jenny Draper 6:20pm
    I expect that tries 20 & 21 won't get very far, due to the conflict in the constraints for strand 13; looking at their initial conformations confirms this, so:
    try23 is now running, off of the initial conformations of try20 & try21, as well as try18.3.80. It is using try23.constraints, which are identical to the constraints for try20/21 (try20.constraints), EXCEPT for strand 13 not having a conflicting "make this a helix" constraint...
    the constraint files to use to check the best decoys are:
        for tries 23, 20, 21: try23.constraints
        for try 22: try22.constraints

28 Aug 2002 Jenny Draper 8:40pm
    the overwritten, decent try21 file I had is now named decoys/T0146.try21.0.80.rastop.pdb

Wed Aug 28 20:49:39 PDT 2002 Kevin Karplus
    With no constraints, T0146.try11-al10.1.40.pdb scores best, but it has nothing but hairpins.
    With try22 constraints, try21.0.80 scores best. About 6 strands of sheet, but Jenny likes try18.3.80 better for the beta-sandwich model, though it doesn't score as well.
    With try21 constraints, try20-opt-scwrl scores best. (oops, haven't included the one Jenny thought was best---rescoring). It's not too bad as a single sheet.
    Jenny also suggested try3-robetta2.1.60, which is fairly compact and has a lot of sheet. One strand is wound into a helix.
    Current ordering:
        try3-robetta2.1.60
        try18.3.80
        try20-opt-scwrl
        try11-al10.1.40
        try21.0.80

Wed Aug 28 Jenny Draper 11pm
    Trying two last attempts (try25 - starting from these 5 models), (try26 - starting from try3-robetta2.1.60) using new try25.constraints based on try23.constraints; hoping to basically improve on try3-robetta2.1.60 in the next half-hour...
    Also trying try27, which is the same as try26 except it uses try27.constraints, which fixes me forgetting to add a constraint to make V115-G121 a helix (robetta makes this "strand 6.5"), so it's in our top model...

Wed Aug 28 Jenny Draper 12am
    Well, none of the above tries really did anything; none of them seemed even improved enough to be worth replacing model1 with. I guess the really short undertaker runs just didn't have enough time to sample any new space.
    Just for fun, I'm doing a long run, try28, to learn whether or not my try27 attempt would have done any good, given enough time.
    :)

26 November 2002 Kevin Karplus
    best whole chain was try1.5.20
    best domain _1 was try2-al10+T0146-1swuA-2track-protein-STR-local-adpstyle5.pw.a2m.gz:1swuA.13.60
    best domain _2 was try16-10
    best domain _3 was try13.2.40
    best domain _4 was try13.7.40

    CA RMSD                     whole       _1       _2       _3       _4
    best                      19.0241  14.4597  10.0453  14.4931  10.5645
    1 try3-robetta2.1.60      20.1361  17.4053  12.0392  17.3951  12.6009
    2 try18.3.80              21.3237  20.5213  12.8935  15.3085  14.3686
    3 try20-opt-scwrl         21.6017  20.8086  12.8589  18.3672  14.3224
    4 try11-al10.1.40         21.7395  17.4999  13.0246  15.7971  13.3209
    5 try21.0.80              21.6111  20.4989  13.4218  15.0706  14.1969

28 Nov 2002 Kevin Karplus
    Looking at domain-based superpositions

    For domain 1 (1-24,114-196), model 4 is best, model 1 second best.
    Real beta sheet is antiparallel 345216, with 1 coming from a distant part of the sequence. Strands 2,3, and 6 were not predicted by str neural net. Model 4 has no beta sheet here. Model 1 has one hairpin (or strands 4-5).

    For domain 2 (25-113), model 4 is best, model 1 second best.
    In real_2 the sheet is antiparallel in order 51432 and the predicted helix between 1 and 2 is only half there, to let 1 and 2 run in opposite directions.
    In model 1, the sheet is order 1^2v3^4v5v (so 234 are ok, but 1 needs to be moved between 4 and 5). Also 5 is almost at right angles to the sheet, not really parallel or antiparallel.
    In model 4, only 234 are in the anti-parallel sheet. 1 is separated by about the right distance but is oriented in the wrong direction, and 5 is well away from the sheet.

    For domain 3, model 5 is best, model 1 second best.
    Not much similarity between the correct structure and model 1, and secondary structure prediction is poor. Model 5 is not really much better, still having wrong secondary structure.

    For domain 4 (197-243 not new fold?), model 4 is best, model 1 second best.
    Secondary structure prediction ok, but even model 4 is not a good fit.
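For reference, the CA RMSD values in the 26 November table are root-mean-square deviations over alpha-carbon positions, in Angstroms. A minimal sketch of the metric follows. It assumes the two coordinate sets are already superimposed and matched residue by residue; it does no fitting, and how the superposition behind the table was done is not recorded in these notes. The function name and the toy coordinates are made up for illustration.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) CA coordinates.

    Assumes the structures are already superimposed; no fitting is done.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates (made up): the second copy is the first shifted by
# (3, 4, 0), so every atom moves 5 Angstroms and the RMSD is about 5.
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
shifted = [(x + 3.0, y + 4.0, z) for x, y, z in model]
print(ca_rmsd(model, shifted))
```

Because the deviations are squared before averaging, a few badly placed residues dominate the value, which is one reason the per-domain columns come out lower than the whole-chain column.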