Thu Jun 29 09:52:31 PDT 2006 T0353
Make started Thu Jun 29 09:53:05 PDT 2006
Running on cheep.cse.ucsc.edu

Thu Jun 29 10:03:24 PDT 2006 Kevin Karplus
    BLAST finds no hits in PDB.
    Best is 2fug1 (29% over 49 residues, E-value 1.7).
    85 residues, may need new-fold techniques.

Fri Jun 30 12:16:45 PDT 2006 Kevin Karplus
    HMMs find no hits, best is 1y8qA with E-value 2.5.
    The try1-opt2 model looks somewhat plausible (better than the
    alignments).
    There may be CYS and HIS clustering needed.
    It appears from the try1.log file to be based on 1jg5A, a pentamer.
    H19, C55, C58, H59, C84 are all somewhat conserved.
    We may want to make a pentameric model to do further optimization.

Mon Jul 17 17:15:56 PDT 2006 Martin Madera
    85 residues. T06 shows three conserved cysteines and two histidines
    (as Kevin noted), and two aspartates. All three HMMs are quite
    similar.

    Secondary structure predictions: a nice clear
        anti-strand - anti-strand - helix - ??? - helix - strand
    The last strand could be either parallel or anti-parallel, there
    isn't much signal. The ??? region contains two cysteines and a
    histidine, all three conserved, and a tyrosine / phenylalanine
    (both aromatic).

    Try1 made 56-59 into a strand, but it's supposed to be a helix.
    So try2: same as try1 but with a stronger constraint on the helix.

Make started Tue Jul 18 12:28:04 PDT 2006
Running on shaw.cse.ucsc.edu
    ... re-run the make to correct alignment bug in the Makefile

Tue Jul 18 14:04:33 PDT 2006 Martin Madera
    Actually 54-59 is the CRFCH loop. There are no predictions, really,
    the helix prediction is incredibly weak. Ignore what I wrote above.

Make started Tue Jul 18 14:28:52 PDT 2006
Running on shaw.cse.ucsc.edu
    ... the make got stuck on a chmod -- probably something to do with
    the RAID problem with /projects/compbio/usr. Killed the make and
    restarted.

    What I think may be the functional residues:
        select His19,Cys55,Cys58,His59
        color green
        wireframe 0.1
        spacefill 0.4
    So the loop is roughly where it ought to be, the two histidines are
    reasonably close (but they could be closer)... but it's clear that
    if the cysteines are involved, the loop can't be a strand, because
    the cysteines are on the opposite side from the histidine.

    Burial.
    T0353.t06.CB_burial_14_7-color.rasmol seems to think that the long
    strand should be completely buried! I need to look into the
    pentamer Kevin mentioned.
    T0353.t06.near-backbone-11-color.rasmol partially disagrees, the
    outer side of the strand is green rather than red/brown... but
    still no blue. So there's clearly something going on.
    Both scripts like the way the helices pack towards the sheet, which
    is good news.

    Best-models shows some agreement on the basic fold. I wonder what
    the servers think... aaaand... the top 5 models are all from
    Robetta! Our packing, and burial of hydrophobic residues (judged by
    'near') look better than any of the five Robetta models.

Wed Jul 19 11:42:46 PDT 2006 Martin Madera
    I've looked at the 1jg5 pentamer and I'm not sure it's such a good
    idea, our protein looks very different. Now that the make got
    re-done, I think I'll do a few runs from alignments to see what
    else undertaker can come up with.
    Try2, try3 and try4 running on shaw and peep.

Wed Jul 19 16:05:14 PDT 2006 Martin Madera
    Try1, try2 and try3 are virtually identical! Try4 failed to pack
    the helices against the sheet (which is hanging out in space).
    This must be coming from an alignment, but I don't think it's the
    one Kevin mentioned (1jg5A).

Wed Jul 19 19:20:02 PDT 2006 Martin Madera
    After a lot of looking at the various structures mentioned in
    try1.log, I've come to the conclusion that it *is* a mangled up
    version of 1jg5A after all. What confused me was that in our
    protein the 3rd strand is parallel, whereas in 1jg5A it is
    anti-parallel.
    The correspondence is best seen by looking at 1jg5_monomer.pdb and
    coloring it by group: the C-terminal strand (red) is cut short, so
    the last pair of strands is swapped, resulting in a parallel
    connection. (At first glance the strand alignment seems to be:
        tryXX: orange - blue - lightblue
        1jg5A: orange - red - blue
    and of course the topologies then look completely different, so I
    thought this couldn't possibly be the template.)

    Unfortunately it's the three anti-parallel strands that form the
    core of the pentamer; I'm afraid that by cutting them short (which
    is determined by where the final helix is) we'll damage the
    interface. Well anyway, time to run the pentamer and see what
    happens.

    Which alignment are we using? I think the best way to find out is
    to do a superposition.... the CE alignment is:

    1jg5A      PYLLISTQIRMEVGPTMVGD-----------EHSDPELMQQLGASKRRVLGNNFYEYYVN
    try1-opt2  DTYVKAKD-----GHVMHFDVFTDVRDDKKAIEFAKQWLSSIGEEGATVTSEECRFCHSQ

    1jg5A      DPPRIVLDKLECRGFRVLSM
    try1-opt2  KAPDEVIEAIKQNGYFIYKM

    and I saved the PDB alignment file in align/align.pdb.gz.

    From the structures it's clear what Undertaker did with the
    template (model 1 / blue) and how it mangled up the sheet. The
    first (N-terminal) hairpin / strand pair is clearly copied from the
    C-terminal hairpin of the template! When you look at our str2
    predictions, the reason for extending the second strand and the
    first helix becomes clear: that's what str2 predicts.
    In those parts that are conserved the superposition is actually
    very good.

    Given that this is essentially an ab initio target (unless we're
    extremely lucky and this is the right template), and that
    Undertaker actually stayed very close to the template, I don't
    think we'd gain much by running the pentamer. So I think I'll
    remove 1jg5A (and one more structure that is clearly related) from
    the list of alignments and will concentrate on trying to generate
    alternative models.
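
A rough way to put a number on how little sequence signal the CE superposition above carries is to count identical residues over the columns where both rows are ungapped. This is only a sketch: the helper name `aligned_identity` is made up for illustration, the two rows are the CE rows quoted above (with the gap runs written as `"-" * n`), and CE itself reports structural measures, not this count.

```python
def aligned_identity(row_a: str, row_b: str, gap: str = "-"):
    """Return (identical, aligned) column counts for two gapped alignment rows.

    A column counts as aligned only if neither row has a gap there.
    """
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have equal length")
    identical = 0
    aligned = 0
    for a, b in zip(row_a, row_b):
        if a == gap or b == gap:
            continue
        aligned += 1
        if a == b:
            identical += 1
    return identical, aligned


# The two CE rows from the log, both blocks joined end to end.
template = (
    "PYLLISTQIRMEVGPTMVGD" + "-" * 11 + "EHSDPELMQQLGASKRRVLGNNFYEYYVN"
    + "DPPRIVLDKLECRGFRVLSM"
)
model = (
    "DTYVKAKD" + "-" * 5 + "GHVMHFDVFTDVRDDKKAIEFAKQWLSSIGEEGATVTSEECRFCHSQ"
    + "KAPDEVIEAIKQNGYFIYKM"
)

identical, aligned = aligned_identity(template, model)
print(f"{identical} identical residues over {aligned} aligned columns")
```

A low count here would fit the BLAST and HMM results at the top of the log, where no template scored significantly.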
-------------------------------------------------------------------
For the soft submission, I'm afraid there isn't anything beyond the
automatic run. The hard deadline is on Friday ...
-------------------------------------------------------------------

Wed Jul 19 21:56:53 PDT 2006 Kevin Karplus
    submission done of
        ReadConformPDB T0353.try1-opt2.pdb
        InFilePrefix
        ReadConformPDB T0353.undertaker-align.pdb model 1  # t2k 1y8qA
        ReadConformPDB T0353.undertaker-align.pdb model 2  # t2k 1jg5A
        ReadConformPDB T0353.undertaker-align.pdb model 3  # 1mdvA
        ReadConformPDB T0353.undertaker-align.pdb model 4  # 1wr5A

Thu Jul 20 13:56:50 PDT 2006 Martin Madera
    Try5, try6: remove all alignments to the two d.205.1 templates,
    1jg5A and 1is8K. Otherwise same as try1.

    I think it may be necessary to try and add more templates. Got all
    the PDB files mentioned in
        cut -f 1 *best-scores.rdb | sort | uniq
    and added them to MANUAL_TOP_HITS; did
        make extra_alignments
        make read_alignments
    [PS and added them to the .under files from try7 onwards]

Thu Jul 20 14:46:39 PDT 2006 Martin Madera
    Try5 and try6 are both different, but it's failing to assemble the
    sheet correctly. This is good news, in a way, because it means I
    managed to get rid of the pentamer templates.

    So, sheet constraints:

    t06.n_sep:
        V10  U = +8  weak
        A12  Q = +4  weak
        M18  L = -8  weak
    t06.o_sep:
        V10  L = +8  mediocre
        H16  Q = -4  mediocre
        M18  U = -8  weak
        H19  V = -9  weak

    which means:
        V10.N ... M18.O   w,
        A12.N ... H16.O   w,
        M18.N ... V10.O   w,
        V10.O ... M18.N   ,m
        H16.O ... A12.N   ,m
        M18.O ... V10.N   ,w
        H19.O ... V10.N   .... probably wrong, ignored
    giving
        V10.N ... M18.O   w,w
        V10.O ... M18.N   w,m
        A12.N ... H16.O   w,m
    or
        SheetConstraint V10 A12 M18 H16 hbond V10   # quite strong
    to check:
        select 10,18
        color red
        select 12,16
        color blue
    ... well, try1-try4 all get this wrong!

----------------------------------------------------------------------
    Now let's assume that the third strand comes in anti-parallel.
    This thing has to be able to exist as a monomer, albeit briefly.
    Apparently parallel sheets are less stable than anti-parallel, and
    so I assume that mixed sheets are even less stable.

    Now the first strand is longer than the second strand, which means
    that there are parts of it that need to H-bond to something else...
    the short third strand! Further, the part that's close to the turn
    has strong Y and Z predictions in t06.str2 (meaning that the third
    strand can't bond there), but the first part turns into As. Also,
    we already know that the second part (close to the turn) is already
    H-bonded to the second strand!

    Maybe I'm reading into this too much, but this line of thinking
    seems to suggest that the third strand comes in anti-parallel to
    the first part of the first strand.
----------------------------------------------------------------------
    Now looking at the hydrophobicity patterns using
    T0353.t06.CB_burial_14_7-color.rasmol there seems to be a very good
    match between:

        82 81 80 79 78 77 76 75
        EX bu ex BU BU BU bu ex
        EX BU EX BU BU BU BU ex
        02 03 04 05 06 07 08 09

    and t06.str2 says that the strand is 77-81, which would indicate
    the following H-bonds:

        82 81 80 79 78 77 76 75
        EX bu ex BU BU BU bu ex
           ::    ::    ::
        EX BU EX BU BU BU BU ex
        02 03 04 05 06 07 08 09

    and the constraint:
        SheetConstraint F77 M81 D7 I3 hbond F77   # quite strong

    However, the parallel version is also vaguely possible, I should do
    at least one run with:
        SheetConstraint F77 M81 I3 D7 hbond F77   # quite strong
    and see how it looks.

    So,
        try7: anti-parallel constraints, all extra_alignments
        try8: anti-parallel constraints, all extra_alignments
        try9: parallel constraints on the third strand, all extra_alignments

Thu Jul 20 19:12:05 PDT 2006 Martin Madera
    Try7 & try8 have the third strand parallel! Bloody hell.

    I also noticed that the way I included the extra alignments was
    incorrect... which means that the alignments it used were the same
    as in try5 and try6.
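
The burial-pattern comparison between residues 82-75 and 2-9 above can be checked mechanically by sliding one strand against the other and counting agreeing calls. A minimal sketch, assuming that upper and lower case in the pattern only mark how confident the buried/exposed call is (so `BU` and `bu` count as the same call); the function name `burial_matches` is hypothetical and not part of undertaker or the rasmol scripts.

```python
def burial_matches(strand_a, strand_b, offset=0):
    """Count positions where two burial patterns agree at a given register.

    strand_a[i] is compared with strand_b[i + offset]; positions that fall
    off either end are skipped. Case is ignored (assumed to encode only
    the confidence of the call). Returns (agreeing, compared).
    """
    agreeing = 0
    compared = 0
    for i, call_a in enumerate(strand_a):
        j = i + offset
        if 0 <= j < len(strand_b):
            compared += 1
            if call_a.lower() == strand_b[j].lower():
                agreeing += 1
    return agreeing, compared


# Patterns copied from the log; the C-terminal strand is listed from 82 down
# to 75 so that it runs anti-parallel to residues 2 through 9.
c_term = "EX bu ex BU BU BU bu ex".split()
n_term = "EX BU EX BU BU BU BU ex".split()

for offset in range(-2, 3):
    agreeing, compared = burial_matches(c_term, n_term, offset)
    print(f"offset {offset:+d}: {agreeing}/{compared} calls agree")
```

At offset 0 every one of the eight positions agrees, which is the register the anti-parallel SheetConstraint above encodes; shifting by one residue breaks the alternation at the strand ends.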
    Fixed, increased the anti-parallel constraint, extended the
    constraint for the first two strands based on try9, restarted.
    Try10 and try11 running.

Thu Jul 20 21:13:39 PDT 2006 Kevin Karplus
    There is nothing new in superimpose-best.under, and Martin has left
    me no information I can use to decide which models, if any, should
    be submitted.

    Martin, you'll have to do the submission yourself. The instructions
    are in casp7/README

    You have to edit superimpose-best.under to have exactly 5 models,
    in the order you want them submitted, edit T0353.method to describe
    each of the 5 models (where they came from and why they were
    selected). Then run
        make casp_models
    If any of the models are just alignments, then edit the
    corresponding modelX.ts file to change the parent record to refer
    only to the template aligned to. Then run
        make email

    I'm *not* staying up tonight to see what you do, but I'll try to
    check your work in the morning.

Thu Jul 20 23:17:06 PDT 2006 Martin Madera
    Neither try10 nor try11 did the right thing. And the extra
    alignments STILL didn't get included correctly. Bloody hell.

    OK, another attempt: try12 and try13, with both sheet constraints
    set to 200. Let's see if undertaker can actually assemble that
    sheet for a change...

Fri Jul 21 01:04:11 PDT 2006 Martin Madera
    Neither try12 nor try13 managed to make the third strand
    anti-parallel the way I want. They put the strand in roughly the
    right position, but can't get the hydrogen bonds right. I don't
    understand why, it's very frustrating.

Fri Jul 21 02:26:39 PDT 2006 Martin Madera
    Enough thinking about this. The tries I like the most so far are:
    - try3: the best variation on the pentamer monomer according to our
      scoring function
    - try6: looks vaguely like a protein
    - try10, try13: will need some repacking in ProteinShop

    try14, try15: sanitizing decoys/try10-edit2.pdb.gz (no constraints).
    Running on orcas.

    try16: sanitizing try3-opt2; try17: sanitizing try6-opt2.
    Running on shaw.

Fri Jul 21 03:22:37 PDT 2006 Martin Madera
    Nah, it's too difficult to do what I want to try13. So I'll just
    re-run it a la try16 & try17 = try18, try19. Running on lopez.

Fri Jul 21 04:22:12 PDT 2006 Martin Madera
    Had "include" instead of "ReadConformPDB". Restarted 14-19.

Fri Jul 21 06:06:42 PDT 2006 Martin Madera
    Submission time. According to the unconstrained cost function, the
    best models are:
    - try16-opt2 (based on try3-opt2)
    - try3-opt2
    - try17-opt2 (based on try6)
    - try19-opt2 (based on try13)
    - try15-opt2 (based on edited try10)

    Rosetta likes try14 (based on edited try10), which I think blew
    up(!), followed by try19.

    Not much of an agreement! Maybe this was to be expected, I think
    most of the changes that undertaker made were cosmetic
    optimizations and didn't actually improve the structure much.
    The models should be quite diverse, but probably all wrong.

    Updated best-models based mostly on the unconstrained cost
    function.

    Ah, try16 and try3 are virtually identical; no point submitting
    both. Rosetta also thinks they're almost identical. So submitting
    only try16. This means I should pick one more.

    There is a lot less diversity than I had expected; virtually all
    models agree on the helix and the two strands. The rest,
    predictably, is a mess.

    Going down the list of models that Rosetta likes and trying to
    generate diversity, try9 may be a good idea. Added as our last
    model, removed try3.

    My edits to the method file:
---------------------------------------------------------------------
Model 1 is try16-opt2. It is almost identical to try3, which was an
automatic run that ended up making only relatively minor adjustments
to 1jg5A. 1jg5 is a pentamer, but in this case undertaker stayed very
close to the template so pentamer optimization was deemed unnecessary
to improve the monomer structure.

Model 2 is try17-opt2, a slight re-optimization of try6. Try6 was an
automatic run but with two templates in the d.205.1 SCOP superfamily
removed (1jg5A is in d.205.1).
It seems to be mostly fragment packing.

Model 3 is try19-opt2, based in turn on try13. Here we have tried to
make the predicted third strand anti-parallel to the first part of the
first strand.

Model 4 is try15-opt2, based on a version of try10 edited in
ProteinShop to move one of the helices to a more sensible position.
The second helix is supposed to be a loop, and the final helix was an
anti-parallel strand in try10, but undertaker likes helices. Try10 was
again an attempt to make the third strand anti-parallel to the first
part of the first strand.

Model 5 is try9-opt2. This was an attempt to make the third strand
parallel to the first part of the first strand.
---------------------------------------------------------------------

    Did make casp_models (which went fine, all five .ts files got
    created), and make_email, which also went fine.

Fri Jul 21 08:07:11 PDT 2006 Kevin Karplus
    I noticed that two of the rosetta files were empty (one for try14,
    one for try19), messing up the grep-best-rosetta script.
    I removed the offending files and am recreating them with
        make T0353.do14 T0353.do19
    If Rosetta's favorite matches one of the models that Martin
    submitted, I'll replace Martin's choice with the rosetta-preferred
    version of it.

    Rosetta's favorite (by a tiny margin) is
        try19-opt2.gromacs0.repack-nonPC
    which matches Martin's model 3. I'll replace it.

Fri Jul 21 08:13:14 PDT 2006 Kevin Karplus
    Resubmission done.
        ReadConformPDB T0353.try16-opt2.pdb
        ReadConformPDB T0353.try17-opt2.pdb
        ReadConformPDB T0353.try19-opt2.gromacs0.repack-nonPC.pdb
        ReadConformPDB T0353.try15-opt2.pdb
        ReadConformPDB T0353.try9-opt2.pdb

Sun Sep 24 12:19:17 PDT 2006 Kevin Karplus
    The Zhang server did well again, with the top models from
    Zhang-Server_TS2, ROBETTA_TS4, Zhang-Server_TS1, Zhang-Server_TS5,
    Zhang-Server_TS4, ROBETTA_TS1.

    Our best model was try12-opt2 (not submitted).
    Our best submitted was model5 (try9-opt2).
    Our model 1 was not great: middle of the road for servers, and not
    even as good as SAM_T06_server (which scored like our model3).

    This turned out to be a 4-strand antiparallel sheet, with strand
    order 3-1-2-4.
    We did not predict strand 3, and the 1-2 hairpin was upside down in
    model1, so that we were trying to build 2-1-4 instead of 3-1-2-4.
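
To see how much of the sheet topology model1 shared with the real structure, the two strand orders can be compared by their neighbouring strand pairs. This is only an illustrative sketch: `neighbor_pairs` is a made-up helper, and it looks at strand adjacency alone, ignoring strand direction and hydrogen-bond register.

```python
def neighbor_pairs(strand_order):
    """Return the set of unordered adjacent strand pairs in a sheet order."""
    return {frozenset(pair) for pair in zip(strand_order, strand_order[1:])}


native = [3, 1, 2, 4]   # strand order reported for the real structure
model1 = [2, 1, 4]      # strand order we were effectively building

shared = neighbor_pairs(native) & neighbor_pairs(model1)
missed = neighbor_pairs(native) - neighbor_pairs(model1)

print("shared pairings:", sorted(sorted(p) for p in shared))
print("missed pairings:", sorted(sorted(p) for p in missed))
```

Only the 1-2 hairpin pairing is common to both orders; the 3-1 and 2-4 pairings of the real sheet are absent from the 2-1-4 arrangement.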