Wed Jun 28 10:17:31 PDT 2006 T0351 Make started
Wed Jun 28 10:19:22 PDT 2006 Running on lopez.cse.ucsc.edu

Wed Jun 28 17:41:50 PDT 2006 Kevin Karplus

BLAST found nothing in PDB (best E-value = 1, 1hciA).
The HMMs found nothing (best E-value = 3.3, 1exrA).
The top few alignments are all different structures.
About all we have is that this is mostly helical. There are a couple of short strands predicted, but none of the alignments got strands for them (and neither did try1-opt2).

Wed Jul 5 08:25:05 PDT 2006 Kevin Karplus

I picked up the server models, but scoring them with unconstrained.costfcn on the farm cluster seems to have failed. I'll try again on cheep.

It is crashing on panther3_TS1, which has a problem with its first residue (L68). All its atoms are on the X-axis:

    ATOM  1  CA  LEU  68   0.477  0.000  0.000  1.00  10.00
    ATOM  2  N   LEU  68  -0.913  0.000  0.000  1.00  10.00
    ATOM  5  C   LEU  68   1.971  0.000  0.000  1.00  10.00
    ATOM  6  O   LEU  68   3.249  0.000  0.000  1.00  10.00
    ATOM  7  CB  LEU  68  -1.030  0.000  0.000  1.00  10.00

This causes problems when coplanar_trans tries to locate the spots for burial computations: the three points that are supposed to define the plane (CA, N, C) are colinear, so the plane is indeterminate.

Wed Jul 5 10:20:13 PDT 2006 Kevin Karplus

I added a couple of checks for colinearity of atoms before the calls to xyplane or coplanar_trans, and I think that undertaker can now handle the flawed panther3_TS1 file.

Wed Jul 5 10:44:19 PDT 2006 Kevin Karplus

Best-scoring server is SAM_T06_server_TS1, followed by ROBETTA_TS2, Pmodeller6_TS2=ROBETTA_TS1, ROBETTA_TS4, ...

Thu Jul 13 22:16:15 PDT 2006 George Shackelford

When I looked at score-all+servers.unconstrained.pretty I saw that the scores looked rather odd, starting out with negative values and increasing slowly. score-all.unconstrained.pretty looks OK (after rerunning it), with values in the 160s and up. I'm going to rerun score-all+servers.
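
Side note on the Jul 5 colinearity fix: the following is only a minimal Python sketch of the kind of check described there, not undertaker's actual code. The function name are_colinear and the tolerance sin_tol are assumptions made for illustration. It treats three atoms as colinear when the sine of the angle between CA->N and CA->C is below the tolerance, which is the case where no plane can be defined.

```python
import math

def _sub(p, q):
    """Vector p - q for 3-tuples."""
    return (p[0] - q[0], p[1] - q[1], p[2] - q[2])

def _cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def _norm(u):
    """Euclidean length of a 3-vector."""
    return math.sqrt(u[0] ** 2 + u[1] ** 2 + u[2] ** 2)

def are_colinear(a, b, c, sin_tol=1e-3):
    """True if points a, b, c are (nearly) on one line.

    sin_tol is a hypothetical threshold on |sin(angle)| between
    the vectors a->b and a->c; coincident points count as colinear.
    """
    u, v = _sub(b, a), _sub(c, a)
    lu, lv = _norm(u), _norm(v)
    if lu == 0.0 or lv == 0.0:
        return True  # degenerate: two of the points coincide
    return _norm(_cross(u, v)) <= sin_tol * lu * lv

# CA, N, C of residue L68 from panther3_TS1, as listed in this log.
ca = (0.477, 0.0, 0.0)
n = (-0.913, 0.0, 0.0)
c = (1.971, 0.0, 0.0)
plane_is_defined = not are_colinear(ca, n, c)
```

A caller would skip (or special-case) the plane construction whenever plane_is_defined is False, instead of passing the three atoms on to the plane-fitting routine.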
After the rerun, we get SAM_T06_server_TS1, T0351.try1-opt2.pdb.gz, SAM_T06_server_TS1-scwrl, the rest of the try1's, RAPTORESS_TS2-scwrl, RAPTOR_TS2-scwrl, ROBETTA_TS5, ROBETTA_TS2. The scores are up to 200 by then.

Our scoring is probably not too great. I wonder how our scoring compares with what we had in CASP6?

RAPTOR(ESS) chose a three-helix bundle. That's safe and maybe right. ROBETTA_TS5 looks like it's trying for a knuckle. TS2 is a bundle with a sheet!

Looking at try1.log, I find that it is based on T0351.try1-al7+all-align.a2m:1rypA. It looks like a "knuckle." I kinda like it, though it's probably wrong.

Sat Jul 15 09:40:38 PDT 2006 Kevin Karplus

It would probably be a good idea to polish up some of the server models. You could also try using VAST to find structural homologs for them, then do a prediction run with just a few of those homologs as templates (using different sets for the bundle, knuckle, and bundle-with-sheet possibilities).

Sun Jul 16 18:20:20 PDT 2006 George Shackelford

First I would like to see if the secondaries are likely to be good. The .t04 is almost the same as the t06, so I'm discarding it. Both t06 and t2k really like that tryptophan. Big deal. Both pick up on some prolines, with t06 suggesting some possible conserved leucines. I don't buy those, since leucines would occur frequently as buried residues in what is almost all helices. However, t2k does pick out some glycines. These MIGHT be structural (along with the prolines), but the signals aren't real strong.

Almost all of the conserved signals are in the first half of the sequence; could this indicate two domains? If so, then a simple helical bundle may not be the correct solution.

This is going to be tough. I'll take a look for those VAST structures that Kevin suggested, though we may already have them among our distant hits.

Sun Jul 16 21:54:09 PDT 2006 George Shackelford

I looked at the t06 and t2k str2 logos. There is a clear indication of a small hairpin sheet, which try1 picks up on.
This structure is a bit more complicated than it first appears. There is an indication of a helical cap (or start) that we may be able to use from the t06 n_notor and o_notor logos. I'd love to get something useful out of those.

On to VAST.

Mon Jul 17 11:24:29 PDT 2006 George Shackelford

ROBETTA_TS2
Your VAST Search job was submitted at 07/17/2006 14:23:24(EDT).
Request ID: 1096461764619656611
getting some nice hits.

    PDB    CD  AliLen  SCORE  P-VAL   RMSD  %Id   Description
    1BF5A  1   105     4.0    0.0456  2.2   6.7   Stat-1 Dna Complex
    1LVFA      98      4.0    0.0039  3.0   11.2  Syntaxin 6
    2C5IT      93      3.6    0.0177  2.8   4.3   N-Terminal Domain Of Tlg1 Complexed With N-Terminus Of Vps51 In Distorted Conformation
    2C5KT      85      3.4    0.0283  2.3   4.7   N-Terminal Domain Of Tlg1 Complexed With N-Terminus Of Vps51
    1VCSA      84      4.0    0.0047  4.1   3.6   Solution Structure Of Rsgi Ruh-009, An N-Terminal Domain Of Vti1a [mus Musculus]
    1HS7A      74      4.0    0.0156  2.7   6.8   Vam3p N-Terminal Domain Solution Structure
    1WCRA      72      3.9    0.0064  2.4   5.6   Trimeric Structure Of The Enzyme Iia From Escherichia Coli Phosphotransferase System
    2CRBA      70      3.5    0.0208  2.4   10.0  Solution Structure Of Mit Domain From Mouse Nrbf-2
    2FZTA      62      3.9    0.0056  1.5   8.1   Crystal Structure Of Hypothetical Protein (Tm0693) From Thermotoga Maritima At 2.05 A Resolution
    2CW0A  1   59      3.5    0.0218  2.5   6.8   Crystal Structure Of Rna Silencing Suppressor P21 From Beet Yellows Virus
    1UG0A      57      3.6    0.0184  2.9   5.3   Solution Structure Of The First Murine Bag Domain Of Bcl2-Associated Athanogene 5
    2D2SA  1   57      4.0    0.0181  2.6   7.0   Crystal Structure Of The Exo84p C-Terminal Domains
    2C2LA  2   49      3.8    0.0079  2.4   4.1   Crystal Structure Of The Chip U-Box E3 Ubiquitigase
    1UT9A  4   47      3.8    0.0329  2.0   4.3   Structural Basis For The Exocellulase Activity Of The Cellobiohydrolase Cbha From C. Thermocellum
    1QQEA  2   47      3.3    0.0445  2.1   2.1   Crystal Structure Of The Vesicular Transport Protein Sec17

ROBETTA_TS5
Your VAST Search job was submitted at 07/17/2006 14:25:41(EDT).
Request ID: 81504435609357331
NO hits with med. redundancy.
Trying all PDB.
Your VAST Search job was submitted at 07/17/2006 14:46:38(EDT).
Request ID: 297711921389763120
NO hits.

So we would do well to focus on a bundle with a little sheet. ROBETTA_TS2 doesn't do well with our ehl2, so we can take the current sheet constraint and the t06.str2 constraints, tone down the rr.constraints, and work from a set of the best scores we have now.

Another try could use the ROBETTA_TS2 VAST set to form what we want.

Another try may use some of the distant fold-recognition strategy, but I don't think we're getting much.

Make started
Mon Jul 17 14:25:14 PDT 2006 Running on shaw.cse.ucsc.edu

Mon Jul 17 15:19:23 PDT 2006 George Shackelford

try2 costfcn:

    # include T0351.dssp-ehl2.constraints
    include T0351.t06.str2.constraints
    # sheet designed by: echo "20 25 33 38" | build_sheet.py -t T0351 -o down
    SheetConstraint F20 D25 K33 G28 hbond E21 1.0
    # include T0351.undertaker-align.sheets
    include rr.0.1.constraints

commented out # Include T0351.t04.undertaker-align.under

try2 running on peep

Working on using the VAST matches as possible alignments: making extra_alignments and read_alignments.

try3 based on the VAST alignments to ROBETTA_TS2
try3 running on shaw

So I can't resist running a try based on possible distant fold-recognitions:

    1i4mA  151.09   1.29137  1.10.790.10-108
    1d9cA  148.642  1.27044  1.20.1250.10-121
    1fjgI  144.071  1.23137  3.30.230.10-127
    1fjgM  142.976  1.22202  ,1.10.8.50-71,4.10.910.10-54
    1rfbA  140.66   1.20222  1.20.1250.10-119
    1aoiC  139.594  1.19311  1.10.20.10-115
    1ezjA  139.575  1.19295  ,1.10.287.320-62,1.10.287.340-52
    1hulA  138.093  1.18028  1.20.1250.10-108
    1a7vA  136.588  1.16742  1.20.120.10-125
    1ijxA  136.505  1.16671  1.10.2000.10-125

long shots underway - using the same costfcn as the other tries
try4 running on peep

Mon Jul 17 16:44:56 PDT 2006 George Shackelford

So try2 finished, and it does meet the needs of the constraints, but it makes a bad doughnut of a model. Otherwise it scores pretty well. That's not encouraging. I'm going to see what it focused on.
1rypA perhaps? 1ufiA is in there but I don't quite buy it...

The first returns on try3 show a similar problem to try2: there are two long helices which should come together and don't. I should add a constraint to close the gap, or do something to bend them up. Kevin says I could crank up phobic fit. That would crumple these into a ball. That would be better.

As I look at what try4 is coming up with, I think I might need to make changes to the t06 constraints and break up the long helix(ces) to see what I can get. I'm going to do that and try rerunning try4.

Damn. I just realized I didn't turn off the ReadFragments line!!! Damn.

I commented out the ReadFragments.
I am rerunning try3 as try5.
I am rerunning try4 as try6.

Mon Jul 17 19:37:19 PDT 2006 George Shackelford

Try5 is an improvement, but try6 is a disaster. I need to figure out what happened.

    try5 <- 1bf5A
    try6 <- 1ezjA, 1d9cA.

1bf5 is big. Really big. It is not clear which part try5 is picking up on. Neither 1ezjA nor 1d9cA represents a set of large helices, but the result is still mainly two large helices.

There are three directions to go in.

1) The first is to take the try5 model and bring the two long helices together, with the remainder lying on top and the sheet facing downwards (hydrophobics downward).

2) The second is more difficult. Find templates that allow the helices to bundle up into a five-helix bundle. This means finding such a structure. What happens with the sheet is not clear, but we don't really seem to have any kind of model/template for now.

3) Roll the dice. Take out all the templates and TryAllAligns, and let undertaker see what it can do. This is not a large protein. We may get lucky.

For #1, we need to close the gap. I am going to reduce the rr.constraints to .05 while finding those that could bring it together and leaving them at full strength. Otherwise I'll have to resort to ProteinShop and move it all about myself.
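
The reweighting plan for #1 (scale the rr constraints down to 0.05, but leave the gap-closing ones at full strength) could be scripted roughly as below. This is a hedged Python sketch, not part of the undertaker tool chain: the helper name scale_constraints is hypothetical, and it assumes that the last numeric field of a "Constraint" line (the 7th token) is the weight, which is how the constraint lines quoted in this log appear to be laid out.

```python
def scale_constraints(lines, factor, keep_full=frozenset()):
    """Scale the weight of 'Constraint' lines, except for protected atom pairs.

    Assumed layout (taken from the lines quoted in this log):
        Constraint ATOM1 ATOM2 v1 v2 v3 WEIGHT [bonus]
    keep_full is a set of frozenset({ATOM1, ATOM2}) pairs left untouched.
    """
    out = []
    for line in lines:
        tokens = line.split()
        # Comments, includes, and other non-Constraint lines pass through.
        if len(tokens) < 7 or tokens[0] != "Constraint":
            out.append(line)
            continue
        pair = frozenset((tokens[1], tokens[2]))
        if pair not in keep_full:
            tokens[6] = format(float(tokens[6]) * factor, ".6g")
        out.append(" ".join(tokens))
    return out

rr = [
    "# from the rr.constraints",
    "Constraint L82.CB L98.CB -10. 7.0 14.0 0.322116290579",
    "Constraint I7.CB L22.CB -10. 7.0 14.0 0.0516539254325",
]
# Keep the helix-closing pair at full strength, scale everything else.
protected = {frozenset(("L82.CB", "L98.CB"))}
scaled = scale_constraints(rr, 0.05, protected)
```

Using an unordered pair as the key means a constraint is protected regardless of which atom is listed first.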
NOTE: I need to have constraints up the length of the helices, else it will simply break them.

Looking at the rr display in rasmol I find: L82 <-> L93. Hmmm. Not much more. I'll see if there are any 78 <-> 97 or the like. From looking at it, it appears that the two helices may need to rotate upwards a bit.

The hairpin is awkward. I don't think I have it right. I'll try flipping the way it faces around and get 24 matched to 27 (as it wants to be).

    SheetConstraint F20 N24 K33 N27 hbond F20 1.0

Below is apparently the best we can do with the rr.constraints. There really are not a lot of them. This should not be so surprising, since the helices can adhere to each other without a lot of co-evolving residues.

    include rr.0.05.constraints
    # from the rr.constraints we get these.
    # we remove the bonus.
    Constraint L82.CB L98.CB -10. 7.0 14.0 0.322116290579
    Constraint L82.CB I96.CB -10. 7.0 14.0 0.313963021864

Oops. I almost forgot. I decided to take out the include of 1bf5A! Might see what that causes. Nothing that we can't undo with a later run, but it might do some good...

try7 running on shaw.

Well, that takes care of #1. Should I just try #3 as well? It isn't real hard to set up.

Removed all the includes and TryAllAlign.

try8 running on shaw.

Mon Jul 17 23:52:05 PDT 2006 George Shackelford

I've just realized that there is a problem with a five-helix bundle: there is not enough burial on the helices to support a bundle. The only real core is the small sheet and the joint at about 88-92. There are a few other patches indicating that the helices are touching. There is not enough to suggest (at the moment) that we could form the five-helix bundle.

Meanwhile I'll keep doing some research into the main templates from those that we've already got (i.e. those with read_alignments).

I went too far with try7; I need to restore 1bf5A at least. The helices did come together, but pretty much everything else broke. I don't see that my change to the sheet is a problem.
I'll step that back and see if we can get things working a bit better. So we step back (a bit) to try5.

Surprisingly, try8 did pretty well. Maybe if I can't get anywhere with working try5, I can start working with try8.

Now I see why try7 had no difficulty with getting the helices together. 1lvfA is a simple three-helix bundle. That is probably also why it broke!

I notice that 2d2sA came in second to 1lvfA when looking at 'best' of try7. I should take a look at it.

Meanwhile, tracing back:

    1lvfA, 1bf5A <- VAST of ROBETTA_TS2
    try7 <- 1lvfA (<- try5 with 1bf5A commented out)
    try6 <- try4 <- try2 <- long shot alphabetmatch using ehl2,burial
    try9 <- try5 <- 1bf5A
    try8 <- only fragments...

I wish I didn't have to do so much careful selection of templates, but undertaker is still showing that it is most comfortable with fold-recognition (as observed by Kevin). try8 suggests that it can do decently on its own. If only I could tell it what I like about each template, but right now it selects one mainly based on the cost constraints.

Because of the very weak burial signal, maybe I should search for ehl2,near and see if I have a distant template to try.

W.r.t. approach #2, I found multi-helical templates in distant matches: there are 1af5, 1bgf, and 1ijx.

Try9 crushed the sheet and buried a helix between two helices. I don't buy it for a second. It got what it wanted from 1lvfA:

    try9 <- 1lvfA

I'm taking it and 1bf5A out.

    try10 <- ???? <- try9 (w/o 1lvfA,1bf5A)

try10 running on bark(!) overnight...

Tue Jul 18 10:19:20 PDT 2006 George Shackelford

Try10 looks good except for where it decides to fold upwards from the two helices. I can take the basis for try10 and rerun with the rr.constraints at full. That should hopefully bring that part down.

Looking at try10.log, I can't be sure if it is coming from 2d2sA or 1ut9A. Both score almost the same, with the same number of clashes. I see that the number of clashes in the best pick is 70, which is the same as 1wcrA.
I'll look at them and see if I can figure it out.

I am also going to make sure that there is enough gap in the 88-92 joint to separate the two helices. They seem way too close (or maybe not).

I've made some adjustments to the helical constraints of t06.str2.constraints based on what I see in t06.n_notor2.

    try11 <- (with rr.constraints, increased 88-92 gap) <- try10 <- 2d2sA,1ut9A (1wcrA?)

Following up on the multi-helical approach #2, I'm going to isolate 1af5, 1bgf, and 1ijx, and do a run with just them. I'll use the rr.0.1.constraints.

I've found that 1af5 and 1bgf came from a different run of alphabetmatch using different gapstart (.1) and gapextend (.4) costs. I'm adding them to the TOP_HITS. This is going to be quite different.

    try12 <- 1af5, 1bgf, 1ijx + rr.0.1.constraints XX<- try6 <- long shot

Make started
Tue Jul 18 12:57:53 PDT 2006 Running on shaw.cse.ucsc.edu

Tue Jul 18 12:58:06 PDT 2006 George Shackelford

Per Kevin's e-mail concerning a bug in Make.main, I am rerunning the make -k operation for T0351 on shaw.

I'm restarting try12 on a different processor; I don't think the change will really have a large impact.

Try11 did not perform as I had hoped. The full rr.constraints force the sheet between the two main helices. Although try11 scores well, I actually like try10 better. Also, the rr.constraint that helps bring the two helices together is just a "bonus" in the rr.constraints; I need to take the bonus out. I think I'll just pull all the rr.constraints into the *.costfcn, and then I can adjust them as I wish.

----
# almost all of these constraints are for the sheet,
# but I've taken care of the sheet above.
# The constraint from I7 to L22 is of interest...
Constraint I7.CB L22.CB -10. 7.0 14.0 0.0516539254325
# Although I am not using these next constraints, I notice that
# they strongly suggest that I7 is near L3, implying a helix
# Constraint I7.CB I31.CB -10. 7.0 14.0 0.363489614286 bonus
# Constraint L3.CB Y30.CB -10. 7.0 14.0 0.360993079452 bonus
# Constraint Y4.CB L22.CB -10. 7.0 14.0 0.326550097872 bonus
# I adjusted the next two down by inserting a 1 after the decimal
Constraint L82.CB L98.CB -10. 7.0 21.0 0.1322116290579
Constraint L82.CB I96.CB -10. 7.0 21.0 0.1313963021864
# some rr constraints that were not a part of rr.constraints
# boosted by inserting a 1 after decimal
Constraint L36.CB L82.CB -10. 7.0 21.0 .125
Constraint I31.CB I96.CB -10. 7.0 21.0 .124
----

    try13 <- modified costfcn - try11

Tue Jul 18 14:36:44 PDT 2006 George Shackelford

try12 finished. It built a triangular structure. Interesting, wrong, but at least it's different and it gets a decent score. I think I'll find the template it used, remove it, and run again.

I couldn't wait for the newest 'make -k' to finish; it runs so slowly, apparently due to a slow network. I've gone and started try13.

try13 running on vashon.

I may do another try8 where we start with nothing, use the latest costfcn (of try13), and see what we get again. Hopefully something different. In fact, I'm going to do that.

    try14 <- w/ new costfcn - no templates

try14 running on vashon.

Tue Jul 18 15:32:32 PDT 2006 George Shackelford

try13 finished opt1 and it is looking like a disaster. I'm going to get started on what happened. I think that my new costfcn may have forced undertaker to choose a template that doesn't work. Looks like I'll have to comment out another one...

try13's log file is corrupt. It is possible that the work that Garcia was doing disrupted the file. In that case, other files may be corrupted. I hope not. It also may be the result of rerunning 'make -k.'

Now I can't tell how try13 developed unless I start it again and rerun until I get a trace on the best scores. Damn. Time is running out.

I'm going to copy and restart try13 as try15. That way I at least have some history to fall back upon. I need to be prepared to break into try15 ASAP when I know what it is coming from.

Now I find I am afraid to run the try13/try15 again.
What if it screws up because of 'make -k'? And when can I expect that to finish??

try14-opt1 is finished, and it is nice but not useful.

I can also do try12 with a different template. That will at least give us a different model.

Long shot run:

    try16 <- w/o 1bgf <- try12 <- 1bgf.

I can't start the long shot until the extra_alignments and read_alignments have finished on the manual templates.

I simply could not wait longer. I started try15 and found that the template it used was 1ut9A. I assumed that was also the template for try13, so I stopped try15, commented out the 1ut9A, and restarted it. So:

    try15 <- w/o 1ut9A <- try13

I have found that there are errors in the run for try15. I am stopping it until the 'make -k' has finished.

Tue Jul 18 19:06:42 PDT 2006 Kevin Karplus

try13.log.gz is not completely corrupt, but a huge chunk of the beginning is all-zeros. Probably nothing to do with Jorge, as they were working on a different filesystem on a different file server. Could be a bad disk block.

I see George is still generating Template.atoms files. I'll remove the one from the try15 run, since we still have a Template.atoms.gz from before.

There has not been a new best-models.pdb.gz since June 28, which is just the automatic one. Is there really nothing else to submit for this target?

I expect to see things in the superimpose-best.under and the README file by 8pm tonight, so that I can format and submit.

Tue Jul 18 18:41:58 PDT 2006 George Shackelford

'make -k' finally finished. I have started try15 - again.

Looking at what I have, try10-opt2 is closest to what I want. I may try putting it through "polishing" with some strong constraints and seeing if I can get that last part in place.

    <- try10 <- ???? <- try9 (w/o 1lvfA,1bf5A)

----- top unconstrained scoring --------

try9 scores well, but it's only a bunch of helices. I don't like it.

try4 scores well and can be used as an alternate.

try2 just looks ugly. It has a bad break, and it looks very twisted and unnatural.
try5 is actually nice because it has almost the shape we want. The helices are separated and the return helix is sitting down between them; otherwise it's what I want.

try8 is a set of helices; it looks a bit odd but makes a nice alternative.

try1 is an appealing "knuckle." Makes a great alternative.

try10 is next, and it is the closest to what I want. Can I make it fold down??

The top server model scores quite well. We should consider it as well.

The current best choices:

For all tries, I increased the weight of constraints to 15.

-- These tries are based on the standard set of templates. -----

try5 was an initial run that used template 1bf5A. I included rr.0.1.constraints and replaced the dssp-ehl2 constraints with t06.str2. I added a sheet constraint for the single sheet.
    actually nice because it has almost the shape we want. The helices are separated and the return helix is sitting down between them; otherwise it's what I want.

try9 used try5's *.costfcn and try5's *.under without 1bf5A. try9 came from 1lvfA. I also changed the sheet hbond from E21 to F20 to get the hairpin fixed.
    scores well, but it's only a bunch of helices. I don't like it.

try10 used try9's *.costfcn with no changes, and used try9's *.under with 1lvfA removed. I THINK try10 comes from 1ut9A, maybe 2d2sA.
    Very close to what I want, but the section with the sheet needs to fold down.

try11 used everything about try10 except it used the full-blown rr.constraints.
    Pushed the two helices apart and buried the rest between them. The best scoring of

--- This try is based on no templates, just the fragments. -------

try8 *.under is the original without any templates, just the last fragments. "Rolling the dice." The costfcn is like try5's but has the constraint for pulling the helices together (makes a core). Uses rr.constraints scaled by 0.05.

--- These tries are based on possible distant fold recognition.

I anticipate using these as alternative folds. I really think try5 and try10 are the best (so far).
try4 based on the ReadFragments line instead of the alternative templates. At least it scores well, and we could use it as an alternative.

also:
try1 The Original. Would make a great alternative. Looks like a triangle!

going to put try10, try11, try5, try1, try4 in the superimpose

As I stated earlier, the reason I like the model I was pushing for in try5, try10, and try11 is the information I gleaned, primarily from the near-backbone-11 data. It suggests that most of the helices are relatively exposed and that there is only a patch at the 88-92 junction that matches well with the patch that is part of the small sheet. The rr constraints basically agree, and str2, n_notor, and o_notor show where the breaks should be.

There is a possibility that there is a more complex shape with more breaks in the helices, but I could not find a good match either in our standard templates or in my more distant templates.

Tue Jul 18 21:49:53 PDT 2006 Kevin Karplus

I don't particularly care for any of the models generated, but I've submitted

    ReadConformPDB T0351.try10-opt2.pdb
    ReadConformPDB T0351.try11-opt2.pdb
    ReadConformPDB T0351.try5-opt2.pdb
    ReadConformPDB T0351.try1-opt2.pdb
    ReadConformPDB T0351.try4-opt2.pdb

I think that this model might have benefited from a little more ab initio and less fold-recognition work: once the sheet and helices had been formed, moving the helices around in ProteinShop might have been more productive than looking for more templates.

E-mail me early if another submission is needed in the morning.

Wed Jul 19 11:23:18 PDT 2006 George Shackelford

I actually tried using ProteinShop but found it so clumsy that I couldn't get anything done. Perhaps someone like Firas, who's worked more with it, could get something going.

What I found frustrating was not searching for new templates but trying to exclude templates which "take charge" and push the model away from the desired structure.

What is even more discouraging is demonstrated in try16.
I took the try5.under, which achieved a good start towards the desired structure, and gave it the try13.costfcn, which had the best-tuned constraints, and the result was an unexpected set of helices. Specifically, I had a break between helices in the 84-91 region. This break disappears in this model. Maybe it would help if we had a "coil" constraint (no helical h-bonds).

The model took advantage of other allowed breaks; I could close those and undertaker would likely use another template - or another or another.

Our constraints don't generate fragments, though perhaps they could. Such large fragments would actually be easier (I think) for undertaker to manipulate.

Sat Sep 9 17:40:45 PDT 2006 Kevin Karplus

The evaluation of this model seems to be a bit thrown off by how much of the target was not solved: only residues 1-60 are in the solved NMR structure.

Sat Sep 9 18:06:19 PDT 2006 Kevin Karplus

The problem was that undertaker was reporting 0 RMSD for models that had no overlap with the experimental model, a ridiculous thing to do. After fixing that, our best model was try10-opt2.gromacs0.repack-nonPC, very similar to our best submitted (model1=try10-opt2).

We did pretty well on this one.

There does not seem to be enough penalty for missing atoms, as the rather poor alignment UNI-EID_sfst_AL1 scores well because of low RMSD, though it is missing a *lot* and has low GDT.
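
For reference, the no-overlap RMSD problem above can be illustrated with a small Python sketch. This is not undertaker's code: the function name ca_rmsd and the dict-of-coordinates representation are assumptions for illustration, and the structures are assumed to be already superimposed (no fitting is done here). The point is only that an empty overlap should yield "no value" rather than 0, and that reporting coverage alongside RMSD makes a model that is missing a lot visible.

```python
import math

def ca_rmsd(model, reference):
    """CA RMSD over residues present in both structures.

    model, reference: dict mapping residue number -> (x, y, z).
    Returns (rmsd, coverage), where coverage is the fraction of
    reference residues that the model contains. With no shared
    residues the RMSD is None, never 0.0.
    """
    shared = sorted(set(model) & set(reference))
    if not shared:
        return None, 0.0
    sq_sum = 0.0
    for res in shared:
        sq_sum += sum((m - r) ** 2 for m, r in zip(model[res], reference[res]))
    return math.sqrt(sq_sum / len(shared)), len(shared) / len(reference)

# Toy data: the reference covers residues 1-2 only.
reference = {1: (0.0, 0.0, 0.0), 2: (3.8, 0.0, 0.0)}
no_overlap = {61: (0.0, 0.0, 0.0), 62: (3.8, 0.0, 0.0)}
partial = {1: (3.0, 0.0, 0.0)}

rmsd_none, cov_none = ca_rmsd(no_overlap, reference)
rmsd_part, cov_part = ca_rmsd(partial, reference)
```

Returning the coverage together with the RMSD leaves it to the caller to decide how heavily to penalize missing residues.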