Wed May 31 09:37:29 PDT 2006 T0307 Make started
Wed May 31 09:38:16 PDT 2006 Running on lopez.cse.ucsc.edu

Wed May 31 11:49:44 PDT 2006 Kevin Karplus
No good hits in BLAST (best is 1zchA at E-value 0.5).
No good hits with HMMs either.
This looks like it will be an ab-initio target.

Wed May 31 14:44:41 PDT 2006 Kevin Karplus
Top hit is 256bA at E-value 4.8.
There is no consistency of fold among the top hits (except, perhaps, for several hits on a.39.1.5 starting at E-value 6.1).
This will probably be an ab-initio target, unless the a.39.1.5 hit turns out to be real.

Wed May 31 16:42:57 PDT 2006 Kevin Karplus
The models from undertaker alignments seem to agree that the target is all-alpha, but not on any of the details. The try1-opt2 model is a bit foamy, with some of the amphipathic helices turned the wrong way. I think we'll have to play a bit with this one to get a reasonable packing.

Tue Jun 20 16:33:35 PDT 2006 Martin Madera
I like a.39.1, it has the same feel as the secondary structure predictions for this protein (which are independent of the template matches): lots of helices and a few short strands.

a.39.1 is EF-hand, a well-known superfamily. From what I can see, domains in all families contain two (or more) EF-hand motifs, and many are dimers. According to SUPERFAMILY it's almost exclusively a eukaryotic superfamily, though there are some good matches here and there in a few bacteria (often as single proteins of about 150aa -- T0307 is 133aa). Pfam and SUPERFAMILY don't show any matches to T0307 (unsurprisingly).

This is what Pfam has to say about the EF-hand motif:

"Many calcium-binding proteins belong to the same evolutionary family and share a type of calcium-binding domain known as the EF-hand. This type of domain consists of a twelve residue loop flanked on both side by a twelve residue alpha-helical domain. In an EF-hand loop the calcium ion is coordinated in a pentagonal bipyramidal configuration.
The six residues involved in the binding are in positions 1, 3, 5, 7, 9 and 12; these residues are denoted by X, Y, Z, -Y, -X and -Z. The invariant Glu or Asp at position 12 provides two oxygens for liganding Ca (bidentate ligand)."

I saved the Pfam sequence logo in decoys/EFhand_Pfam.png.

The representative structures in Grant's ancient set are:

    1k8uA a.39.1.2
    1psrA a.39.1.2
    2pvbA a.39.1.4
    1qv1A a.39.1.5
    1m45A a.39.1.5
    1omrA a.39.1.5
    1k94A a.39.1.8

According to the str2/T06 predictions, T0307 has three loop regions of about the right length:

    1-17
    31-43
    93-102

1-17 is at the start of the protein and so lacks the N-helix. It also doesn't seem to match the logo. The other two loops contain pretty good matches to region 14-22 of the logo:

    3       4
    234567890
    DGKIKPAEI
    ** *?  **

    9 10
    890123456
    DGDIDDNEL
    ** **  **

    * = good match to a conserved residue
    ? = bad match to a conserved residue ("what's going on?")
    (space = residue not conserved)

The key glutamates are at positions 39 and 105 in T0307 (66 residues apart).

Now it turns out that in 3D the two loops often form a small beta sheet:

    http://structbio.vanderbilt.edu/cabp_database/pic_gallery/confchange/retreat97.1.html

and this is certainly the case for the three structures above that fall in our family (a.39.1.5).

[Quoth Firas, upon seeing 1k9u (a.39.1.10) -- "Dude, check out this awesome shit!" The domain is formed by two interlocked chains, and the two beta sheets -at either end of the domain- are formed by one strand from each chain!]

What made me think of this is that our loops are quite far apart in the sequence -- 66 residues -- which is more than in any other protein in the whole fold that I've looked at; the usual distance is around three helices, i.e. 35-45 residues. Also, most structures contain two sheets (i.e. four loops), at opposite ends of the domain / protein / complex. This is either through the domain having four loops (internal duplication), or through being a dimer.
There's space for another loop around 60-70, and then at the start of the protein, around 7-17 or something. Hmmm, doesn't quite make sense. Whatever.

Tue Jun 20 21:02:30 PDT 2006 Firas Khatib
Well, the soft deadline for this is in less than 15 hours, so I don't think we can attempt a dimer. We do have until July 11th to work on that, however, but we should try to make a monomer that we like for tomorrow.

Tue Jun 20 21:18:34 PDT 2006 Martin Madera
OK, so we need to figure out which residues H-bond with which, to add it as a constraint for undertaker. I've taken this from 1hqv (a.39.1.8):

    restrict 107-115,143-151
    center 110

The sequences in 1hqv are:

    107   115
    SGMIDKNEL
    ** **  **

    143   151
    RGQIAFDDF
    ?* *?  **

and the H-bonds in 1hqv are:

    I146.O ... I110.N
    I146.N ... I110.O

and it's obviously an anti-parallel sheet. So we're after the isoleucines, i.e. I35 and I101 in T0307.

Tue Jun 20 23:00:08 PDT 2006 Kevin Karplus
Hmm, it looks like we have nothing at all to submit for the soft deadline. I'm a bit dubious about an EF-hand fold, as EF-hands are usually pretty easy to recognize by HMMs. Still, it is as good a lead as any.

It might be worth putting in some EF hands in the MANUAL_TOP_HITS (preferably ones that don't score too horribly), doing

    make extra_alignments
    make read_alignments

and then starting the TryAllAligns with only the

    InfilePrefix 1xxxX/
    include read-alignments-scwrl.under

scripts (moved before the first TryAllAlign, and the Includes and all-align.a2m alignments commented out).

I'll do the "automatic" submission (try1-opt2 and the first 4 models from alignment).

Tue Jun 20 23:10:04 PDT 2006 Kevin Karplus
"automatic" submission done.

Tue Jun 20 23:31:36 PDT 2006 Martin Madera
Oops, just noticed a typo in try2.costfcn:

    Hbond I35.N I101.O 1
    Hbond I35.N I101.N 1

should be I35.*O* on the second line!

Try2 does what I wanted, kind of, except with I66 instead of I101. Will correct the typo and try again as try3, bumping up the weights to 100.
Overall try2 isn't very good.

Wed Jun 21 01:09:14 PDT 2006 Martin Madera
Try3 ignored the constraint -- the Hbond weights were at 100 but I set the overall weight for constraints back to 10. (It was 40 for try2.) I'll try 50 (and 100 for the Hbond weight) and see if it finally does what I want!

Wed Jun 21 02:52:13 PDT 2006 Martin Madera
Try4 also ignored the constraint. Undertaker really doesn't like it. Maybe the way forward is to extend it to a few residues and turn it into a sheet constraint? Or do the weights need to be even stronger?!

Have started try5 (with a sheet constraint instead of Hbond) and will go to bed. Firas, have a look at try2-5 (esp. try5) and see if they look any better than what Kevin submitted.

Wed Jun 21 16:15:21 PDT 2006 Martin
Try5 (with the sheet constraint) finally does what I want, except the helices are now all over the place. The way forward is to hand-edit the alignments.

Mon Jul 3 15:37:39 PDT 2006 Firas Khatib
in lab meeting, Kevin noticed that T2k seems to be working better than T06 so we should use those predictions instead. I will try to fold it up into a 6-helical bundle with Proteinshop

Fri Jul 7 15:06:28 PDT 2006 Firas Khatib
I am trying try7 on shaw, it is just a test using the T0307.t2k.dssp-ehl2.constraints with a 4-helix bundle (that I made in Proteinshop) to see how strong those constraints are.

Fri Jul 7 16:43:44 PDT 2006 Firas Khatib
I messed up try7 by writing:

    T0307.t2k.dssp-ehl2.constraints

instead of

    include T0307.t2k.dssp-ehl2.constraints

try8 is running on shaw

Fri Jul 7 19:32:30 PDT 2006 Martin Madera
Having a second look at this target. I agree that the t2k HMM looks the best. But look at the conserved residues in the t2k-w0.5 logo: it picks up precisely the loops I noticed earlier, D32-E39 and again D98-E105. And note how the profiles for the two loops look remarkably similar (and similar to region 14-21 of the Pfam logo, decoys/EFhand_Pfam.png).
I think it's a fair bet that those two are close in space (simply from clustering of conserved functional residues) and run in opposite directions (from symmetry).

I think the hydrogen bonds I came up with earlier,

    Hbond I35.N I101.O 1
    Hbond I35.O I101.N 1

are as good a constraint as any, and I will concentrate on building a decent ab initio model that satisfies this constraint.

Otherwise we should concentrate on t2k alignments, because the HMM does look much better than t04 or t06. I will do this by following Kevin's try10 from T0329. To quote from T0329/README:

-----------------------------------------------------------------------
[I]t looks like the t06 alignment is consistently doing better than the t04 and t2k alignments. You might want to do an initial run, like try1, but with only alignments from the t06 HMMs.

One way to do this would be to put all the reasonable templates (basically the top 10 or 20 hits in T0329.t06.best-scores.rdb) into MANUAL_TOP_HITS in the Makefile, do

    make extra_alignments
    make read_alignments
    foreach x (*/read-alignments-scwrl.under)
        grep -h t06 $x > $x:s/scwrl/t06-scwrl/
    end

Then include each of the read-alignments-t06-scwrl.under files to read in the alignments in the try.under script.

I'll do this as try10 for T0329.
-----------------------------------------------------------------------

I added all the templates from T0307.t2k.best-scores.rdb to the Makefile:

    2aaoA 1a2xA 1uhnA 2mysB 1tn4 1ncx 1top 2ccmA 1mvwB 1pvaA
    1jk0A 1afvA 1jipA 1go3F 1osa 2scpA 1aj4 1wdcB 1eupA 1yqyA

and then did:

    make extra_alignments
    make read_alignments
    (tcsh)
    foreach x (*/read-alignments-scwrl.under)
        grep -h t2k $x > $x:s/scwrl/t2k-scwrl/
    end

and then for try9 I followed T0329/try10.under and .costfcn. Try9 running on peep.

Fri Jul 7 22:04:12 PDT 2006 Martin Madera
Oops, used t06 instead of t2k in try9.under! Corrected, restarted.
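The tcsh filtering loop used for try9 can also be sketched in Python, with the HMM tag (t06, t2k, ...) passed as a single argument so that it is set in only one place. This is an illustration only, not part of the lab's scripts: the function name filter_alignments and the root argument are made up for the example, and it assumes (as the grep does) that every relevant line of a read-alignments-scwrl.under file mentions the tag.

```python
from pathlib import Path


def filter_alignments(root, tag):
    """Rough equivalent of: grep -h TAG $x > $x:s/scwrl/TAG-scwrl/ per template dir.

    Hypothetical helper; `root` is the target directory that holds one
    subdirectory per template (e.g. 2aaoA/).
    """
    written = []
    for src in sorted(Path(root).glob("*/read-alignments-scwrl.under")):
        # Keep only the lines that mention the chosen HMM tag.
        kept = [ln for ln in src.read_text().splitlines(keepends=True) if tag in ln]
        # Rename like the tcsh :s modifier: first "scwrl" becomes "<tag>-scwrl".
        dst = src.with_name(src.name.replace("scwrl", f"{tag}-scwrl", 1))
        dst.write_text("".join(kept))
        written.append(dst)
    return written
```

Calling filter_alignments(".", "t2k") from the target directory would then write one read-alignments-t2k-scwrl.under file next to each original.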
Fri Jul 7 22:10:18 PDT 2006 Martin Madera
Bug in the Makefile: "1afv\A" instead of "1afvA", so those alignments never got made. Corrected, re-ran, restarted try9. SIGH.

Fri Jul 7 23:20:29 PDT 2006 Martin Madera
Try9 finished, but the beta-bridge is in the wrong spot (as always) -- because I forgot to add the constraint to the cost function! Christ I am hopeless today.

Try10 is the same as try9, except I have now added the constraint. However, I also remembered that Hbond constraints are much weaker than sheet constraints, so the constraint I actually added is:

    SheetConstraint (T0307)K34 (T0307)K36 (T0307)D102 (T0307)D100 hbond (T0307)I35 5

(copied from try5). Running on peep.

Sat Jul 8 12:52:55 PDT 2006 Martin Madera
Try10 did create the beta bridge, as well as another one at 87-121. But according to T0307.t2k.dssp-ehl2.constraints residue 87 is supposed to be in a helix. One way of killing the other bridge would be to increase the weight on the helix constraints.

So try11 = try10 but with higher weights on the dssp-ehl2 constraints (copied into the .costfcn, set to 2). Try11 is going to be *very* ab initio, so I've decided to also turn down breaks (50->25) and soft_clashes (20->10); at this point all I want are the helices and the correct beta bridge. Running on peep.

Sat Jul 8 16:45:36 PDT 2006 Martin Madera
Try11 has the same extra bridge as try10. Sigh. Constraints 10 -> 20, weight for the helix that has residue 87 in it: 2->5. Running as try12 on peep.

Sat Jul 8 18:14:31 PDT 2006 Martin Madera
Try12 is beginning to look the way I want: just one beta bridge and lots of helices. Of course the helices are all over the place, but that will be the next step.

Actually before I do that, about the beta bridge, I don't like the direction the two isoleucines (I35 and I101) are pointing (away from each other).
Looking at 2aao (the top hit for t2k),

    restrict 37-51:A,70-85:A
    wireframe off
    backbone 100
    center 37-51:A,70-85:A
    select ile
    colour green

a suitable distance seems to be:

    ILE80A.CB-ILE44A.CB: 4.059

In try12 the CB-CB distance is 7.0A. So I have another constraint,

    Constraint I35.CB I101.CB 3.5 4.06 5.0 4

and I'll reduce the SheetConstraint weight to 3 (since it's saying essentially the same thing). Running as try13 on peep.

Sat Jul 8 20:11:08 PDT 2006 Martin Madera
Try13 has the two loops perpendicular instead of (anti-)parallel! But the isoleucines are now pointing in the same direction.

Time to go back to 2aao and pick a few more constraints to make the loops parallel:

    Distance GLN43A.CA-ASP81A.CA: 4.837
    Distance GLN43A.CA-ILE80A.CA: 5.610
    Distance ILE44A.CA-ILE80A.CA: 4.919
    Distance ILE44A.CA-THR79A.CA: 5.351
    Distance THR45A.CA-THR79A.CA: 4.896

which translates into:

    Constraint K34.CA D102.CA 4.3 4.837 5.3 1
    Constraint K34.CA I101.CA 5.1 5.610 6.1 1
    Constraint I35.CA I101.CA 4.4 4.919 5.4 1
    Constraint I35.CA D100.CA 4.9 5.351 5.9 1
    Constraint K36.CA D100.CA 4.4 4.896 5.4 1

Added to the try14 cost function, running on peep.

Sat Jul 8 21:50:42 PDT 2006 Martin Madera
Try14 finished. The loops are still perpendicular! Let's see what the constraints look like:

    Constraint K34.CA D102.CA 4.3 4.837 5.3 1 ... 5.7
    Constraint K34.CA I101.CA 5.1 5.610 6.1 1 ... 5.6
    Constraint I35.CA I101.CA 4.4 4.919 5.4 1 ... 4.5
    Constraint I35.CA D100.CA 4.9 5.351 5.9 1 ... 5.5
    Constraint K36.CA D100.CA 4.4 4.896 5.4 1 ... 4.9

Damn, that isn't too bad! And yet it's perpendicular rather than parallel.
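The step from 2aao distances to constraint lines, and the check against try14, can be sketched as follows. This is a guess at the bookkeeping, not undertaker code: it assumes the three numbers in a Constraint line are a minimum, a desired and a maximum distance followed by a weight, and that the window is the template distance plus or minus 0.5 A. The helper names constraint_line and in_window are invented for the example.

```python
# 2aao CA-CA distances mapped onto T0307 residue pairs (values from the log).
TEMPLATE = [
    ("K34", "D102", 4.837),
    ("K34", "I101", 5.610),
    ("I35", "I101", 4.919),
    ("I35", "D100", 5.351),
    ("K36", "D100", 4.896),
]
# CA-CA distances measured in try14 for the same pairs (from the log).
TRY14 = [5.7, 5.6, 4.5, 5.5, 4.9]


def constraint_line(a, b, dist, half_width=0.5, weight=1, atom="CA"):
    """Format one line; ASSUMED field order: min desired max weight."""
    lo, hi = dist - half_width, dist + half_width
    return f"Constraint {a}.{atom} {b}.{atom} {lo:.1f} {dist:.3f} {hi:.1f} {weight}"


def in_window(observed, dist, half_width=0.5):
    """True if the observed distance lies inside the assumed window."""
    return dist - half_width <= observed <= dist + half_width


lines = [constraint_line(a, b, d) for a, b, d in TEMPLATE]
status = [in_window(obs, d) for obs, (_, _, d) in zip(TRY14, TEMPLATE)]

for line, obs, ok in zip(lines, TRY14, status):
    print(f"{line} ... {obs} {'ok' if ok else 'outside window'}")
```

Under these assumptions only the K34-D102 pair falls outside its window in try14; the other four distances are within range.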
I think I'll have to tighten up the regions,

    Constraint K34.CA D102.CA 4.74 4.84 4.94 1
    Constraint K34.CA I101.CA 5.51 5.61 5.71 1
    Constraint I35.CA I101.CA 4.82 4.92 5.02 1
    Constraint I35.CA D100.CA 5.25 5.35 5.45 1
    Constraint K36.CA D100.CA 4.80 4.90 5.00 1

and add new constraints for the ends:

    # currently 6.83
    Constraint K36.CA D102.CA 9.04 9.14 9.24 2
    # currently 4.46
    Constraint K34.CA D100.CA 5.66 5.76 5.86 2

Running on peep as try15.

Sun Jul 9 01:46:20 PDT 2006 Martin Madera
Try15 looks the best so far (though that isn't saying much). The loops are still more perpendicular than parallel; the constraints look thus:

    Constraint K34.CA D102.CA 4.74 4.84 4.94 1 ... 4.84
    Constraint K34.CA I101.CA 5.51 5.61 5.71 1 ... 5.62
    Constraint I35.CA I101.CA 4.82 4.92 5.02 1 ... 4.92
    Constraint I35.CA D100.CA 5.25 5.35 5.45 1 ... 4.84 !!!
    Constraint K36.CA D100.CA 4.80 4.90 5.00 1 ... 5.04 !!
    Constraint K36.CA D102.CA 9.04 9.14 9.24 2 ... 9.14
    Constraint K34.CA D100.CA 5.66 5.76 5.86 2 ... 5.75

I think that's good enough for now; once the helices are in place (through more constraints), hopefully the loops will adjust.

Now, the helices. If one looks at T0307.t2k.CB_burial_14_7-color.rasmol it's clear that the core of the protein is formed by the four helices that flank the two loops. (Which makes sense -- maybe a hint that I'm on the right track?) However, I get the impression that we really want a mirror image of the arrangement in try15, swapping the left and right loops, in order to make the hydrophobic residues face the core rather than the outside.

So, constraints on positions of the helices. I shall boldly postulate that:

    A25, [I35,] M43, L91, [I101,] W109

are all within 10A of each other, generating the following constraints:

    Constraint A25.CA W109.CA 5.0 7.5 10.0 2
    Constraint L91.CA W109.CA 5.0 7.5 10.0 2
    Constraint L91.CA M43.CA 5.0 7.5 10.0 2
    Constraint A25.CA M43.CA 5.0 7.5 10.0 2

Downgraded the I.CB constraint to weight 1, it isn't that important really.
Running as try16 on peep.

Sun Jul 9 11:53:53 PDT 2006 Martin Madera
Try16 is beginning to look like a protein! Very exciting. The beta bridge finally looks the way it should, and there's a four-helix bundle underneath it. Unfortunately one of the helices in that bundle is the wrong one (and strongly hydrophilic; the right one, with a massive hydrophobic surface, is hanging out in space), that will need to be fixed.

So, constraints:

    Constraint A25.CA W109.CA 5.0 7.5 10.0 2 ... 18.7 !!!
    Constraint L91.CA W109.CA 5.0 7.5 10.0 2 ... 9.5
    Constraint L91.CA M43.CA 5.0 7.5 10.0 2 ... 9.5
    Constraint A25.CA M43.CA 5.0 7.5 10.0 2 ... 12.6 !!

Looking at the structure, 9.5 seems like a reasonable distance. So let me bump up weight and make the regions tighter:

    Constraint A25.CA W109.CA 8.0 9.0 10.0 3
    Constraint L91.CA W109.CA 8.0 9.0 10.0 3
    Constraint L91.CA M43.CA 8.0 9.0 10.0 3
    Constraint A25.CA M43.CA 8.0 9.0 10.0 3

Running as try17 on peep.

Sun Jul 9 15:16:32 PDT 2006 Kevin Karplus
try17 doesn't look anything like try16. I suspect this didn't do what Martin wanted.

Martin, could you put the best current models (in your opinion) into best-models.pdb.gz using superimpose-best.under?

Sun Jul 9 15:52:02 PDT 2006 Martin Madera
Indeed, try17 blew up. Sigh!

My best model so far is try16-opt2, everything else from try9 onwards is a mess. My earlier attempts (before the current series starting at try9) were in the same general direction and are no good as CASP submissions. Not sure about Firas' models, haven't looked at them. We'll talk about them tomorrow.

Now, try17. Hmmm. Maybe the distance region (8-10A) is too narrow and the costs are too steep (so it gets trapped in local minima??!). I think I'll go back to the try16 regions but double the cost,

    Constraint A25.CA W109.CA 5.0 7.5 10.0 4
    Constraint L91.CA W109.CA 5.0 7.5 10.0 4
    Constraint L91.CA M43.CA 5.0 7.5 10.0 4
    Constraint A25.CA M43.CA 5.0 7.5 10.0 4

Hopefully with each run being random, I'll get lucky.
Running on peep as try18.

Sun Jul 9 16:18:23 PDT 2006 Martin Madera
Updated best-models with my favourite models so far (minus Firas' models, which I haven't looked at yet).

Sun Jul 9 21:25:50 PDT 2006 Martin Madera
Try18 blew up as well. I think the way forward is to edit try16-opt2 with ProteinShop. Started working on it.

Sun Jul 9 22:31:56 PDT 2006 Martin Madera
Edited try16-opt2 and saved it as decoys/edit2.pdb.gz. Starting try19 on peep as an effort to improve edit2. Reset the four helix-helix distance constraints (4 -> 1) and the HelixConstraints (2 or 5 -> 1).

Mon Jul 10 01:22:53 PDT 2006 Martin Madera
Try19 was a disaster:

    1) I forgot to change try18 -> try19 in the .under file, so it overwrote parts of try18. But that doesn't matter, because try18 was a disaster.
    2) I forgot that I was still using low breaks and clashes, so it didn't do much in the way of refinement.
    3) I need to play with the ProteinShop a bit more to make sure the extra helices don't interfere with the 4-helix bundle

So, try20. Soft_clashes -> 20, break -> 50. Trying to improve edit3.pdb. Running on peep.

Mon Jul 10 07:35:14 PDT 2006 Martin Madera
Try20 doesn't look very good, the beta bridge is broken (why?!). Checking the helix distance constraints:

    Constraint A25.CA W109.CA 5.0 7.5 10.0 1 ... 10.3
    Constraint L91.CA W109.CA 5.0 7.5 10.0 1 ... 11.4
    Constraint L91.CA M43.CA 5.0 7.5 10.0 1 ... 9.2
    Constraint A25.CA M43.CA 5.0 7.5 10.0 1 ... 9.3

Hmm, nothing to write home about!

Bumping up constraints (20 -> 30), bridge constraints (1 -> 2 and 2 -> 4) and alpha-helix distances (1 -> 2). Try21 running on peep.

Mon Jul 10 16:11:51 PDT 2006 Firas Khatib
Ran try22 using edit3 as input (which was a Proteinshop model that Martin had created). It was not able to form the beta-bridge. I am going to check the costfcn again and see what I can change. It seems like they are all good constraints, I will take out the helix constraints and increase the beta-bridge ones to see if that helps.
I lowered breaks and increased constraints to 40 (just to see if this helps the beta-bridge at all!) This will be try23 running on squawk.

Mon Jul 10 18:10:25 PDT 2006 Firas Khatib
I am going to continue from try8 (to try to get a different type of structure). I took try8 and Proteinshopped it to get:

    4helixFromTry8.pdb
    4helixFromTry8.rotated.pdb

This last model will be the input for try24 with no constraints. Running on camano.

Make started
Mon Jul 10 18:54:17 PDT 2006 Running on peep.cse.ucsc.edu

Mon Jul 10 19:06:50 PDT 2006 Firas Khatib
I am going to try a run based on Robetta4, to try to get a 6-helix bundle, and because I like the sheet at T12 & M30. try25 is using:

    T0307.t2k.dssp-ehl2.constraints

and is running on shaw with these additional constraints:

    StrandConstraint S7 F13 0.9
    StrandConstraint D32 K36 0.5

Mon Jul 10 22:38:50 PDT 2006 Firas Khatib
try24 scores best with the unconstrained costfcn as well as try24th costfcn! try25 scores not too badly... either

Tue Jul 11 02:30:57 PDT 2006 Martin Madera
Looking at the unconstrained cost function, my best attempt so far is try16. Have done more ProteinShop editing and will try to re-optimize the best attempt (decoys/edit9.pdb.gz) with the try16 cost function. Running as try26 on peep.

Try27 is the same as try26 but soft_clashes x2 (10->20) and break x2 (25->50). Also removed the helix constraints and lowered all other constraints to 1 (from 2). Running on lopez.

Tue Jul 11 07:47:02 PDT 2006 Martin Madera
I like try26, the packing is pretty good even if breaks and clashes are horrible: it has the lowest dry6.5 out of all of our models so far! Alas submission deadline... (Try27 is worse.)

Updated best-models, replacing try16 with try26.
Tue Jul 11 08:07:08 PDT 2006 Kevin Karplus
It seems that we are submitting

    try24-opt2 < 4helixFromTry8.rotated < 4helix.renum (hand made)
    try26 < edit9 < try16 < 1afvA
    try3 < 1ireA
    try1 < 1yqyA fully automatic
    try25 < ROBETTA_TS4

Tue Jul 11 08:18:41 PDT 2006 Kevin Karplus
So submitted.

Tue Jul 11 08:51:16 PDT 2006 Martin Madera
Who cares, it's after the deadline, BUT:

    try28: 'polishing' try26 and edit9 (higher breaks and soft_clashes)
    try29: same as try28 but without the helix (& h-h distance) constraints, and all constraints except the beta sheet one -> 1.

Tue Jul 11 09:44:42 PDT 2006 Firas Khatib
deadline is noon, we can always resubmit if it is finished on time and if you find it to be better.