Fri Jul 7 09:11:38 PDT 2006 T0365 Make started
Fri Jul 7 09:14:06 PDT 2006 Running on cheep.cse.ucsc.edu

Sat Jul 8 00:03:41 PDT 2006 Kevin Karplus
    No good hits in PDB with BLAST (best is 1pa4A, E-value 1.6).
    Modest hit with HMMs (1sumB, 2.2 e-6).

Fri Jul 21 3:32 George Shackelford
    The structure in try1 is very interesting and almost certainly wrong.
    ROBETTA has this as a basic six-helix bundle. I looked at the t2k.w0.5
    and t06.w0.5 sequences and I believe the t2k is better. I'll try
    pushing t2k as the basis for a solution.

Mon Jul 24 19:54:43 PDT 2006 Kevin Karplus
    I agree that this is a 6-helix bundle. We should probably pick up
    some distance constraints from the well-aligned parts of the
    undertaker-align models. A few near the ends of each of the helices
    (say CB constraints with optimum distances picked up with Pick
    Distance in rasmol) would probably be enough to pin down the helices.
    Here are a few:
        L219.CB V138.CB 4.4
        L219.CB I139.CB 4.7
        V215.CB A135.CB 3.8
        E213.CB M161.CB 5.2
        A209.CB I168.CB 4.2
        A209.CB A131.CB 5.5
        A209.CB L165.CB 5.5
        P114.CB D194.CB 3.8
        P114.CB V195.CB 6.4
        P120.CB L175.CB 4.9
        L134.CB M161.CB 6.0
        V33.CB V132.CB 5.1
        L88.CB I139.CB 6.1

Tue Jul 25 16:35:26 PDT 2006 George Shackelford
    I'm going to give undertaker a chance to try again. I'm going to
    restrict to t2k fragments, t2k.ehl2, etc. While that is going, I'll
    take these connections under advisement. I'm taking the
    rr.constraints entirely for now. I should run traincontactnn using
    the t2k predictions and then cull them for pairs that are not along
    helices.

    try2 running on vashon.

    From what I see in 'best', this is picking up on either
    al6+all-align.a2m:1o5hA or al7+all-align.a2m:1ut0A, with a very slim
    possibility of all-align.a2m:1vctA.

Tue Jul 25 21:47:58 PDT 2006 George Shackelford
    Well, that didn't work. Despite the fact that try1 was based on
    all-align.a2m:1oq9A, and try2 was based on the set above, the
    results appear to be the same.
    We're both interested in a simple six-helix bundle and undertaker
    isn't finding one. I'm taking a look at these hits. 1oq9A apparently
    gets a partial match, which can cause the model to spread out. 1oh5
    is part of a dimer that is no bundle, and 1ut0A looks possible but it
    is no bundle either. Still, the near predictions appear to twist
    about the helices, suggesting a coiled-coil structure. I am reluctant
    but will try the restraints that Kevin suggests.

Wed Jul 26 13:16:46 PDT 2006 George Shackelford
    I have rerun the rr.constraints using the t2k files as the basis,
    since I have more faith in the t2k predictions. The results are
    below. Next I have used the t2k ehl2 predictions to comment out those
    pairs that could not be near each other if the helix constraints are
    correct:

    Constraint A96.CB L208.CB -10. 7.0 14.0 0.581746351925
    Constraint M78.CB I202.CB -10. 7.0 14.0 0.526920474913
    Constraint I139.CB L179.CB -10. 7.0 14.0 0.441016781963 bonus
    # Constraint Y199.CB L208.CB -10. 7.0 14.0 0.431038828669
    # Constraint M22.CB L32.CB -10. 7.0 14.0 0.397492770863
    # Constraint L89.CB A99.CB -10. 7.0 14.0 0.386397137168
    Constraint I54.CB L85.CB -10. 7.0 14.0 0.372859926316 bonus
    # Constraint F121.CB A135.CB -10. 7.0 14.0 0.365630585714
    # Constraint I202.CB A212.CB -10. 7.0 14.0 0.365012542857
    Constraint L165.CB L208.CB -10. 7.0 14.0 0.364571928571 bonus
    # Constraint I129.CB V138.CB -10. 7.0 14.0 0.360137430137
    # Constraint L76.CB Q92.CB -10. 7.0 14.0 0.355100664227
    # Constraint A96.CB V106.CB -10. 7.0 14.0 0.351235220244
    Constraint I139.CB M161.CB -10. 7.0 14.0 0.347484215592 bonus
    # Constraint V132.CB L142.CB -10. 7.0 14.0 0.340471955224
    # Constraint V25.CB F35.CB -10. 7.0 14.0 0.338924228514
    # Constraint V205.CB V215.CB -10. 7.0 14.0 0.337865726997
    Constraint I54.CB I95.CB -10. 7.0 14.0 0.337033377871 bonus
    # Constraint V25.CB F36.CB -10. 7.0 14.0 0.335470121086
    # Constraint L89.CB I102.CB -10. 7.0 14.0 0.333988283925
    # Constraint L125.CB A135.CB -10. 7.0 14.0 0.331954218367
    # Constraint A29.CB T39.CB -10. 7.0 14.0 0.327771506383
    # Constraint M78.CB Q92.CB -10. 7.0 14.0 0.325451480999
    # Constraint L165.CB L175.CB -10. 7.0 14.0 0.323589745928
    # Constraint F121.CB V132.CB -10. 7.0 14.0 0.322245916005
    # Constraint I162.CB T172.CB -10. 7.0 14.0 0.318956605411
    # Constraint L76.CB L85.CB -10. 7.0 14.0 0.317590797327
    # Constraint L125.CB L142.CB -10. 7.0 14.0 0.316735035635
    # Constraint I129.CB L142.CB -10. 7.0 14.0 0.315103090909
    # Constraint L18.CB L32.CB -10. 7.0 14.0 0.314903578826
    # Constraint V205.CB L219.CB -10. 7.0 14.0 0.313784886076
    # Constraint L88.CB A99.CB -10. 7.0 14.0 0.313432513419
    # Constraint R180.CB L208.CB -10. 7.0 14.0 0.312451912485
    # Constraint L165.CB L183.CB -10. 7.0 14.0 0.311514971598
    # Constraint Q92.CB I102.CB -10. 7.0 14.0 0.311081949112
    # Constraint L198.CB L208.CB -10. 7.0 14.0 0.309909179882
    # Constraint L85.CB A99.CB -10. 7.0 14.0 0.306855054022
    # Constraint F77.CB A99.CB -10. 7.0 14.0 0.303809438936
    Constraint I162.CB V195.CB -10. 7.0 14.0 0.303515752116 bonus
    Constraint I139.CB V158.CB -10. 7.0 14.0 0.301077258599 bonus
    # Constraint F77.CB L88.CB -10. 7.0 14.0 0.300940412739

    (NOTE: I removed the -factor .05 when I failed to get any
    constraints!)

    Eliminating the unlikely ones results in rather drastic pruning. I
    think what we have left is useful. I'm going to include them in the
    constraints. I am using these for a try3 starting from try2.under
    and .costfcn.

    try3 running on vashon - has al7+all-align.a2m:1hw1A as parent

Thu Jul 27 04:21:32 PDT 2006 Kevin Karplus
    Since George seems reluctant to try the obvious step that I suggested
    above, preferring to ignore the alignments and use neural nets
    instead, I guess I'll have to set up the constraints myself, though I
    really don't have time to work on this target.
    I'm not sure *why* George prefers t2k on this target---it does not
    seem to have much more conservation in the sequence logo, nor a
    better match on conserved residues (except, perhaps, F121).

Thu Jul 27 04:36:58 PDT 2006 Kevin Karplus
    George changed the Makefile to use a local copy of Make.main WITHOUT
    LEAVING ANY COMMENT IN THE README FILE about doing it. Creating a new
    Make.main is a very bad idea, as we will later be using the Makefile
    to do evaluation of the results, and the local Make.main will be out
    of sync with the rest of the targets.

    The *right* way to override make target definitions is to provide
    new definitions in the Makefile, before or after the include, as
    appropriate.

    Now I have to clean up the mess, as well as trying to get a try4 run
    started.

Thu Jul 27 04:49:55 PDT 2006 Kevin Karplus
    I found two changes that George had made in Make.main (as well as
    several places where he had missed recent changes to the real
    Make.main). I incorporated his changes into Make.main:
        definitions for ${RR_EXT} being 449a_45t2k
        --factor not forced to be 0.05
    For the --factor parameter, I created an RR_FACTOR macro whose
    default value is 0.05, but which can be reset (before the include)
    in the Makefile. I also set it for George in the local Makefile.

    George's Make.main has been moved to obsolete-Make.main

    try4 started on lopez to try to keep the distances I listed above.
    Constraint L219.CB V138.CB -10 4.4 7.0 0.5
    Constraint L219.CB I139.CB -10 4.7 8.0 0.5
    Constraint V215.CB A135.CB -10 3.8 7.0 0.5
    Constraint E213.CB M161.CB -10 5.2 8.0 0.5
    Constraint A209.CB I168.CB -10 4.2 7.0 0.5
    Constraint A209.CB A131.CB -10 5.5 8.0 0.5
    Constraint A209.CB L165.CB -10 5.5 8.0 0.5
    Constraint P114.CB D194.CB -10 3.8 7.0 0.5
    Constraint P114.CB V195.CB -10 6.4 9.0 0.5
    Constraint P120.CB L175.CB -10 4.9 8.0 0.5
    Constraint L134.CB M161.CB -10 6.0 9.0 0.5
    Constraint V33.CB V132.CB -10 5.1 8.0 0.5
    Constraint L88.CB I139.CB -10 6.1 9.0 0.5

    These constraints were extracted from the top alignment in
    T0365.undertaker-align.pdb, from alignment
    T0365-1sumB-t06-local-str2+near-backbone-11-0.8+0.6+0.8-adpstyle5.a2m

    Let's see if they hold the bundle together. They may be too few, or
    too weak, or in the wrong places, but it would be fairly easy to
    automate the extraction of a similar set, so the experiment is worth
    trying.

Thu Jul 27 07:46:14 PDT 2006 Kevin Karplus
    The constraints in try4.costfcn did hold the bundle together, but
    there are a few bad breaks.

    T0365.try4-opt2.pdb.gz has 12 breaks
    T0365.try4-opt2.pdb.gz breaks before (T0365)L86 with cost 3.65678
    T0365.try4-opt2.pdb.gz breaks before (T0365)P192 with cost 3.55395
    T0365.try4-opt2.pdb.gz breaks before (T0365)F77 with cost 3.45142
    T0365.try4-opt2.pdb.gz breaks before (T0365)I162 with cost 1.50134
    T0365.try4-opt2.pdb.gz breaks before (T0365)I122 with cost 1.4177
    T0365.try4-opt2.pdb.gz breaks before (T0365)I15 with cost 1.03962
    T0365.try4-opt2.pdb.gz breaks before (T0365)L165 with cost 0.977576

Thu Jul 27 07:58:19 PDT 2006 Kevin Karplus
    I just noticed that someone had edited try1.under after it had been
    run. This is a serious error. We want to keep records of what we
    have done, and changing the inputs after running the program
    destroys the record. There isn't even a backup file, so I suspect
    that it was not edited with emacs, leading me to suspect that George
    changed the file.
    ALWAYS CREATE NEW COPIES BEFORE EDITING---DON'T CHANGE THE OLD FILES.

    I think, since it was try1, we can probably recover the original
    from my computer at home. I will rename the mangled try1.under to
    mangled-try1.under. I re-generated try1.under to get the original
    text back, but I'll need to do a transfer to get the original date
    back.

Thu Jul 27 08:01:50 PDT 2006 Kevin Karplus
    try5 started on lopez to try to polish up try4 and close gaps. It
    actually does a read-pdb.under, but try4 scores much better than
    previous tries with try5.costfcn. It also scores slightly better
    than previous tries with the unconstrained costfcn, despite the
    breaks.

Thu Jul 27 10:45:50 PDT 2006 George Shackelford
    Try5 does help to close the breaks, but there is still one break to
    close. One more polishing should get that.

    I am going to use emacs to edit these files as Kevin has suggested
    (just have to be careful about keyboard commands and hope I'm not
    cut off from the server). I've turned on auto-fill. I think it
    should work now.

    I don't know where the change to try1 came from. I don't recall
    making any changes. It appears from comparing mangled with the
    correct try1 that I must have accidentally copied back over the
    original try1. I don't understand.

    Why isn't auto-fill working? I'm assuming that the metakey is
    'escape'... Ok, I think that worked.

    Although phobic_fit seems high, the near and burial displays with
    spacefill show a credible job of burial. The helices on the ends
    have indications that they have no real burial or near and could be
    sticking out into the water. I don't think that is a problem.

    I'm going to do one more run starting with try5 gromacs.repack and
    see if we can't close the gaps. I'm also going to modify the
    constraints so the first helix is continuous, to get rid of that
    break at the beginning helix by starting it at V3. It does not need
    to bend upwards; there is no indication that it does. In fact, that
    may have contributed to the break.
    I'm wondering about relaxing the hand constraints so undertaker has
    some flexibility to heal the breaks and perhaps shift the helices a
    little to make it work better? Well, not for this try.

    try6 running on peep.

Thu Jul 27 16:36:17 PDT 2006 George Shackelford
    Ok, try6 does better. I think it's time to let the models speak for
    themselves. I'm turning constraints off (there is some possibility
    they have prevented a complete healing of the break by preventing a
    touch of turning).

    Try7 running on peep.

Thu Jul 27 18:53:32 PDT 2006 George Shackelford
    Try7 finished and scored the best. Time is getting late, so I've put
    together the superimpose and built best-models. The ones I've
    selected:

    Model 1 Try7-opt2.pdb
        is a polished and somewhat healed version of try4, which is
        based on 1sumB. The six helices form a nice bundle, wrapping
        around each other.

    Model 2 Try7-opt2.gromacs0.repack-nonPC.pdb
        is the best scoring version using Rosetta scoring. It is
        included for that reason; otherwise it is another version of
        try7.

    Model 3 Try1-opt2.pdb
        NOTE: an alternative would be to use the gromacs-repack.
        This is what we get from the initial run. Actually scores very
        well using Rosetta.

    Model 4 Try3-opt2.pdb
        is a very kinky model as an alternative to the six-helix
        bundle.
        NOTE: We could put try2 in here instead. At least it looks like
        a protein, but try3 scores better both unconstrained and in
        Rosetta.

    Model 5 SAM-t06_TS1.
        Nuff said.

Fri Jul 28 17:33:37 PDT 2006 Kevin Karplus
    I restored the missing try1.under from my machine at home.

    Well, George put together some models, but didn't do a submission,
    so we missed the soft deadline. We can still make the hard deadline
    though.

    I'm wondering if we should try building some bundles with different
    constraints, perhaps from a different alignment.
    There are some parts of try7 that I don't like:
        Turn at D46 instead of predicted location near T41
        Break before L86
        Unraveling helix from Q115 to I122
        Kink in helix near K160

    The unraveled helix looks easy to fix in theory, but may be
    difficult to manage with our current tools, as we would need to
    insert breaks around I113-I122, then reform the helix before closing
    the gaps. Doing that may be possible with undertaker, if we first
    make some deliberate gaps, and put in a really strong helix
    constraint.

Fri Jul 28 18:00:33 PDT 2006 Kevin Karplus
    I created T0365.try7+breaks.pdb from
    try7-opt2.gromacs0.repack-nonPC by moving Q110-I122 (-10,0,0), then
    moving P114-Q118 an additional (0,10,0) to insert breaks. This has
    also made a mess of clashes, but I think that undertaker may be able
    to sort it out without just putting it back the way it was.

Fri Jul 28 18:25:55 PDT 2006 Kevin Karplus
    I think that there is also a misalignment for V48-L70 that might be
    fixable by shifting the alignment so that W44 is where V48 is. Doing
    this would probably require starting over from alignments with new
    distance constraints. I'll look into that after try8 runs.

Fri Jul 28 18:36:39 PDT 2006 Kevin Karplus
    try8 started on cheep, trying to fix the breaks in try7+breaks, in
    an attempt to fix the stretched-out helix at Q115-I122.

Fri Jul 28 19:58:46 PDT 2006 Kevin Karplus
    Well, try8 attempted to fix the unraveled helix, but did so by
    unwinding the helix between Q126 and I129, leaving a bit of a void.
    Nothing else was badly damaged though (the helices did not fly
    apart, despite a lack of any constraints to hold them in place).

    I'll try optimizing from the same starting point again, with a
    somewhat longer helix constraint. I'll also add a Strand constraint
    for Q110-L112 to try to keep the helix from forming there.

    I'll also try adding some Hbonds:
        From o_notor2 and n_sep
            Hbond P114.O L117.N
        Weaker
            Hbond V106.O L111.N
            Hbond I107.O Q110.N
    These should help fix the end of the helix.
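
    The segment shift used to build T0365.try7+breaks.pdb (Fri Jul 28
    18:00:33 entry) could be scripted roughly as below. This is only a
    minimal Python sketch, not the procedure actually used: the names
    shift_residues and make_try7_breaks are hypothetical, and it assumes
    standard fixed-column PDB ATOM records. It leaves the resulting
    clashes for undertaker to sort out.

```python
def shift_residues(pdb_lines, first, last, delta):
    """Translate ATOM/HETATM records with residue number in [first, last]."""
    dx, dy, dz = delta
    shifted = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            resnum = int(line[22:26])          # PDB columns 23-26
            if first <= resnum <= last:
                x = float(line[30:38]) + dx    # PDB columns 31-38
                y = float(line[38:46]) + dy    # PDB columns 39-46
                z = float(line[46:54]) + dz    # PDB columns 47-54
                line = f"{line[:30]}{x:8.3f}{y:8.3f}{z:8.3f}{line[54:]}"
        shifted.append(line)
    return shifted


def make_try7_breaks(pdb_lines):
    """Hypothetical helper applying the two moves described in the log."""
    # Move Q110-I122 by (-10, 0, 0) ...
    moved = shift_residues(pdb_lines, 110, 122, (-10.0, 0.0, 0.0))
    # ... then move P114-Q118 an additional (0, 10, 0).
    return shift_residues(moved, 114, 118, (0.0, 10.0, 0.0))
```

    Applying the second shift to the output of the first gives P114-Q118
    a net displacement of (-10,10,0), matching the "additional" wording
    in the entry.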
Fri Jul 28 20:20:43 PDT 2006 Kevin Karplus
    try9 started on cheep.

Sat Jul 29 05:28:38 PDT 2006 Kevin Karplus
    Unfortunately, try9 opened things up a bit between the first and
    second 3-helix bundles, so I don't like it much, despite how small
    the breaks are.

Sat Jul 29 07:06:46 PDT 2006 Kevin Karplus
    I tried putting together a new costfcn (try10) which has distance
    constraints from try7 and try8, with the helix constraints for what
    we were trying to get around P114 as in try9 (though not quite as
    strong). I also modified the distance constraints to move the first
    helix by 4.

    I'm trying again from the alignments, to get a different 6-helix
    bundle.

Sat Jul 29 08:32:38 PDT 2006 Kevin Karplus
    try10-opt2 does not look too bad. There is a bit of a problem with
    the helices near L85, F121, and F184. The constraints I added to
    move helix1 are mostly not satisfied.

    Looking at try10-opt2, I think I was trying to move the helix the
    wrong way! For try11, I'll try shifting the first helix the other
    way, and strengthen the helix constraints in the regions where
    predicted helices were disrupted.

Sat Jul 29 08:57:59 PDT 2006 Kevin Karplus
    try11 started on cheep

Sat Jul 29 09:55:21 PDT 2006 George Shackelford
    Kevin's goal is a good one, but in score-all.unconstrained, while
    try7 comes out on top, try8 drops down, try9 drops further, try10
    even further. Perhaps we should stop when we're ahead. At least try7
    becomes either Model 1 or Model 2.

Sat Jul 29 10:14:07 PDT 2006 Kevin Karplus
    try11 messed up, letting the helix F157-L183 separate from the rest.
    I need constraints added (or strengthened) for V158, L165, T172,
    L179, and L183.

    I agree that the newer models are not really doing better yet.
    Perhaps I can make a chimera of try8 or try9 with try11, to get a
    model closer to what we want.

Sat Jul 29 10:22:40 PDT 2006 George Shackelford
    If it is a different fold you want, consider something based on
    BACTERIORHODOPSIN as listed in the following four proteins.
    Rather than two three-helix domains, this is a circle of helices. I
    have some time before taking my wife down to Asilomar(!) today, so I
    may see if I can get a fold like this.

    # program: alphabetmatch
    # George Shackelford
    #
    # Target: T0365
    # length: 226
    # length range: 210 to 248
    # alphabets used:
    #     ehl2 burial
    # id      score     per residue
    5S        10N       10N
    1c3wA     588.509   2.60402   1.20.1070.10-222
    1kgbA     587.77    2.60075   1.20.1070.10-222
    1jgjA     578.155   2.55821   1.20.1070.10-217
    1h68A     572.276   2.5322    1.20.1070.10-218

Sat Jul 29 10:36:13 PDT 2006 Kevin Karplus
    I'm not looking for a new fold, but a better packing of the 6
    helices. Go ahead and try to create a new fold, though
    bacteriorhodopsin is a membrane protein and does not have the
    amphipathic helices of our target, so it seems like a poor choice of
    templates.

    I will continue trying to get a good packing of the 6-helix bundle.
    My latest attempt is chimera-11-10, which is mainly from try11, but
    has residues D143-V195 copied from try10. It has bad breaks, but may
    be more fixable than try10 or try11 alone.

Sat Jul 29 10:45:45 PDT 2006 Kevin Karplus
    try12 started on cheep to try to fix up chimera-11-10.

Sat Jul 29 10:54:30 PDT 2006 George Shackelford
    Taking advantage of available processing time (or wasting it), I'll
    give those templates a run: try13 running on peep.

Sat Jul 29 12:06:50 PDT 2006 Kevin Karplus
    try12-opt2 was not very successful---R151-P192 is not packed against
    the other helices.

    Perhaps I should now do a polishing run from all the models, hoping
    for some crossover. The big question is what the costfcn should
    emphasize. Perhaps I'll start with a run using the try12 costfcn,
    then try one with an unconstrained costfcn that has clashes turned
    up more.

Sat Jul 29 12:12:29 PDT 2006 Kevin Karplus
    try14 started on cheep using same costfcn as try12, but attempting
    to polish all existing models.

    I may also want to generate a try8-try7 chimera that is mainly
    try8, but uses try7 for A123-G133.
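
    The chimeras described above (chimera-11-10, and the planned
    try8-try7 chimera) amount to copying a residue range from one model
    into another. A minimal Python sketch of that splice follows. It is
    an illustration only, not the tool actually used: splice_chimera and
    residue_number are hypothetical names, and it assumes both models
    are fixed-column PDB files in the same coordinate frame (already
    superimposed) and that the base model contains the residue range
    being replaced.

```python
def residue_number(line):
    """Residue number from PDB columns 23-26 of an ATOM record."""
    return int(line[22:26])


def splice_chimera(base_lines, donor_lines, first, last):
    """Return base model with residues first..last taken from the donor."""
    donor_segment = [
        line for line in donor_lines
        if line.startswith("ATOM") and first <= residue_number(line) <= last
    ]
    chimera = []
    inserted = False
    for line in base_lines:
        if line.startswith("ATOM") and first <= residue_number(line) <= last:
            if not inserted:
                # Put the donor's copy where the base segment began.
                chimera.extend(donor_segment)
                inserted = True
            continue  # drop the base model's version of this residue
        chimera.append(line)
    return chimera
```

    For chimera-11-10 the call would look like
    splice_chimera(try11_lines, try10_lines, 143, 195). Atom serial
    numbers are not renumbered, and nothing is done about the chain
    breaks at the splice points; those are left for a later optimization
    run to close.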
Sat Jul 29 12:31:07 PDT 2006 Kevin Karplus
    chimera-8-7 created, using mostly try8, but try7 for A123-G133.

Sat Jul 29 12:56:01 PDT 2006 Kevin Karplus
    try15 started on lopez to try to improve chimera-8-7.gromacs0 and
    chimera-8-7.gromacs0.repack-nonPC

Sat Jul 29 15:41:14 PDT 2006 Kevin Karplus
    I think we're ready to submit---at least I'm giving up on getting
    any better models.

    I'll submit
    ReadConformPDB T0365.try14-opt2.pdb # best unconstrained
    ReadConformPDB T0365.try1-opt2.gromacs0.repack-nonPC.pdb # best rosetta
    ReadConformPDB T0365.try15-opt2.pdb # slightly different good model
    ReadConformPDB T0365.try9-opt2.pdb
    ReadConformPDB T0365.try12-opt2.pdb # different alignment of first helix

Sat Jul 29 16:09:29 PDT 2006 Kevin Karplus
    Submitted with comment

    For T0365 we had good fold-recognition hits on 6-helix bundles,
    though the alignments were not clear (shifts of 4 on each helix were
    possible).

    We did not have the time nor the energy to explore the different
    shifts of the helices, and quite likely have been working on models
    with at least one of the helices off by a full turn.

    Model 1 is try14-opt2, the best-scoring model with several of our
    cost functions. It was polished by undertaker from try7-opt2, from
    try6-opt2, from try5-opt2, from try4-opt2, from alignments (last one
    1ku1A, though that was probably only one or two helices).

    Model 2 is try1-opt2.gromacs0.repack-nonPC. This is the
    automatically generated model, re-optimized by gromacs and with
    sidechains repacked by rosetta. Rosetta likes it best of all the
    backbones it repacked, but we do not like the departure from the
    6-helix bundle that we got as fold-recognition hits. The last
    alignment inserted in the try1 run was 1oq9A.

    Model 3 is try15-opt2, optimized by undertaker from
    chimera-8-7.gromacs0, a gromacs optimization of a chimera made
    mainly from try8-opt2, but with A123-G133 taken from try7-opt2.
    The try7-opt2 model was optimized by undertaker from try6-opt2, from
    try5-opt2, from try4-opt2, from alignments (last one 1ku1A, though
    that was probably only one or two helices). The try8-opt2 model was
    optimized by undertaker from try7+breaks, which was the try7-opt2
    model with some short segments around Q110-I122 moved around to
    create breaks to increase the flexibility for undertaker to close
    gaps.

    Model 4 is try9-opt2, optimized by undertaker from try7+breaks, but
    with a different cost function.

    Model 5 is try12-opt2, optimized by undertaker from a chimera of
    try11-opt2 and try10-opt2. The chimera comes mainly from try11-opt2,
    but has residues D143-V195 copied from try10-opt2, since that helix
    was not packed into the bundle in try11-opt2. Model try11-opt2 was
    optimized by undertaker from alignments, with the last alignment
    inserted coming from 1afrA. Model try10-opt2 was optimized by
    undertaker from alignments, with the last alignment inserted coming
    from 1ku1A.

Sun Jul 30 11:14:05 PDT 2006 George Shackelford
    I found a bug in the building of the database I use for
    "alphabetmatch." After fixing it, the database is now four times
    larger. 1sumB, which had not shown up in earlier searches, is now
    part of the database and shows up as the second best choice. The
    first is 1xwmA, and I used that as the only template for try16 (run
    on peep).

    Try16 is scoring decently; it might score better with some
    polishing. It matches our helix predictions better than the models
    based on 1sumB.

    If I can get it in, I will include the following comment:

    Model 4 is try16-opt2, which is a run based on a global alignment
    with 1xwmA. 1xwmA was the best scoring hit from the distant
    fold-recognition program "alphabetmatch."
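
    The Thu Jul 27 04:49:55 entry remarks that extracting a set of
    distance constraints like the try4 ones would be fairly easy to
    automate. The formatting half of that job can be sketched in Python
    as below. Everything here is an assumption made for illustration:
    the function names are hypothetical, the field order (lower value,
    measured distance, upper bound, weight) is read off the Constraint
    lines in this log rather than taken from undertaker documentation,
    and the upper-bound rule (measured distance plus 3, rounded) is a
    guess that happens to reproduce the try4 lines, not a documented
    rule.

```python
def constraint_line(atom_a, atom_b, distance, slack=3.0, weight=0.5, lower=-10):
    """Format one Constraint line in the style of the try4 costfcn entries."""
    # Assumed rule: upper bound is the measured distance plus some slack,
    # rounded to a whole number of Angstroms.
    upper = round(distance + slack)
    return f"Constraint {atom_a} {atom_b} {lower} {distance:.1f} {upper:.1f} {weight}"


def constraints_from_text(text):
    """Turn whitespace-separated 'atom atom distance' triples into lines."""
    tokens = text.split()
    triples = zip(tokens[0::3], tokens[1::3], tokens[2::3])
    return [constraint_line(a, b, float(d)) for a, b, d in triples]


if __name__ == "__main__":
    measured = "L219.CB V138.CB 4.4  L219.CB I139.CB 4.7  V215.CB A135.CB 3.8"
    for line in constraints_from_text(measured):
        print(line)
```

    Picking which residue pairs to measure (a few near the ends of each
    helix, from the well-aligned parts of an undertaker-align model) is
    the part that still needs judgment; this sketch covers only the
    bookkeeping.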