Tue Aug 10 11:19:14 PDT 2004 T0270 DUE 30 Aug 2004

Tue Aug 10 12:42:21 PDT 2004 Kevin Karplus
Uh, oh, this looks like another new-fold prediction. There seems to be no agreement between the t2k and t04 predictions.

Tue Aug 17 14:04 2004 Bret Barnes
So even though one of the strands in try1 wasn't predicted to be a strand (M184 - A190), I am still going to use it in the constraint function for try2. So that means my constraints will look something like the following:

SheetConstraint (T0270)I64 (T0270)V67 (T0270)F76 (T0270)D73 hbond Y65 5
SheetConstraint (T0270)M184 (T0270)A190 (T0270)G199 (T0270)L193 hbond V186 5

Along with another sheet that I'm going to add:

SheetConstraint (T0270)F108 (T0210)E113 (T0270)G140 (T0270)Y145 5

If this sheet constraint that I am adding is in fact a sheet, then it might form a larger sheet with the second strand (F76 - D73) running parallel, with a helix in between. Before I try adding any other conjectures about how I think this target folds, I want to do a simple run with the three sheet constraints listed above. In the meantime I'll be looking at the conserved residues to see if there is anything to be noticed there.

Oops, I messed up writing down the constraints for try2.costfcn. I'm going to let it finish, but I'm going to re-run it in try3.costfcn.

Looking at the Robetta models (in particular model 10), there seems to be evidence for the three strands to form a sheet:

202-195 185-192 184-189 237-233

This is very similar to our prediction:

SheetConstraint (T0270)M184 (T0270)A190 (T0270)G199 (T0270)L193 hbond (T0270)V186 5

We did not have any secondary structure prediction for 184-190, but I now feel more confident in it. I will probably add the 237-232 strand prediction into try3, which we also show predictions for. I'll probably use the Robetta predictions for the last three strands.
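Hand-typed constraint lines are easy to get wrong (see the try2.costfcn note above). Below is a minimal Python sketch of a sanity check that could be run before starting a job. It is not part of undertaker. The line grammar is an assumption inferred only from the constraint lines written in this README (four strand residues, an optional "hbond" residue, then a weight, with an optional "(TARGET)" prefix on each residue), and check_sheet_constraint and its target argument are hypothetical names.

```python
import re

# Assumed residue token: optional "(TARGET)" prefix, one-letter amino acid
# code, residue number. Inferred from the examples in this README only.
RESIDUE = re.compile(r"^(?:\((?P<target>[^()]+)\))?(?P<aa>[A-Z])(?P<num>\d+)$")


def check_sheet_constraint(line, target="T0270"):
    """Return a list of problems found in one SheetConstraint line."""
    tokens = line.split()
    if not tokens or tokens[0] != "SheetConstraint":
        return ["not a SheetConstraint line"]
    body = tokens[1:]
    if not body:
        return ["missing residues and weight"]

    problems = []
    weight = body.pop()  # assumed: the last token is the weight
    try:
        float(weight)
    except ValueError:
        problems.append(f"weight {weight!r} is not a number")

    if "hbond" in body:
        i = body.index("hbond")
        if i != 4 or len(body) != 6:
            problems.append(
                "expected four strand residues, then 'hbond' and one residue"
            )
        residues = body[:i] + body[i + 1:]
    else:
        if len(body) != 4:
            problems.append("expected four strand residues")
        residues = body

    for token in residues:
        match = RESIDUE.match(token)
        if match is None:
            problems.append(f"malformed residue {token!r}")
        elif match.group("target") not in (None, target):
            problems.append(
                f"unexpected target {match.group('target')!r} in {token!r}"
            )
    return problems
```

Run over every SheetConstraint line of a cost function, a check like this would flag a wrong target ID, a wrong-case prefix, or an unbalanced parenthesis. It cannot tell whether the residues themselves are the intended ones.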
So what I am thinking so far is one possible sheet:

64: IYQV :67
77: LFLLDA :72
113: EVVSYF :108
140: GYVCFY :145
202: LDVGWEWD :195
184: MQVISGAQG :192
237: YKGVF :233

SheetConstraint (T0270)I64 (T0270)V67 (T0270)F76 (T0270)D73 hbond (T0270)Y65 5
SheetConstraint (T0270)L77 (T0270)A72 (T0270)E113 (t0270)F108 5
SheetConstraint (T0270)E113 (T0270)F108 (T0270)G140 (T0270)Y145 5
SheetConstraint (T0270)L202 (T0270)D195 (T0270)Q185 (T0270)G192 5
SheetConstraint (T0270)Q185 (T0270)G189 (T0270Y237 (T0270)F233 5

Using the above constraints for try3, running on baa.

Tue Aug 17 16:45 2004 Bret Barnes
Hmm, looking at Robetta model 9, you see evidence for a strand forming at 154-157. This could be very possible and might help continue to build a sheet.

Tue Aug 17 19:00 2004 Bret Barnes
So, while try3 doesn't score the best, it is starting to get the sheets to form. I'm going to keep the sheets that were formed in try3 and try to strengthen them a bit. Also I'm going to try to add the strand from Robetta model 9 (154-157). I might add some strand constraints to try to get the strands to form where I think they should be, since the secondary structure predictions are not very high. After try4 finishes, if it looks good, I'll do a VAST search and see if we can improve the alignment.

Oops, I messed up try3 sheet constraint two (used a lowercase t instead of an uppercase T). I guess I'll fix that in try4.

Tue Aug 17 11:31 2004 Bret Barnes
Ok, so here are the sheet constraints that I decided on for try4 and try5.
Both try4 and try5 will include the following two sets of sheet constraints:

# 14: GWHVLH :19 # NEW
# 33: PAFSWR :28 # NEW
SheetConstraint G14 H19 P33 R28 5

# 64: IYQV :67
# 77: LFLLDA :72
#113: EVVSYF :108
#140: GYVCFY :145
#160: LMYWND :155 # NEW
SheetConstraint (T0270)I64 (T0270)V67 (T0270)F76 (T0270)D73 hbond (T0270)Y65 5
SheetConstraint (T0270)L77 (T0270)A72 (T0270)E113 (T0270)F108 5
SheetConstraint (T0270)E113 (T0270)F108 (T0270)G140 (T0270)Y145 5
SheetConstraint (T0270)G140 (T0270)Y145 (T0270)L160 (T0270)D155 5

try4 will also contain the next set of sheet constraints (running on woof):

#202: LDVGWEWD :195
#184: MQVISGAQG :192
#237: YKGVF :233
SheetConstraint (T0270)L202 (T0270)D195 (T0270)Q185 (T0270)G192 5
SheetConstraint (T0270)Q185 (T0270)G189 (T0270)Y237 (T0270)F233 5

While try5 will instead have this set of sheet constraints (running on tweet):

#198: WGVDLF :203
#211: KFQVPD :206
#237: YKGVFF :232
SheetConstraint (T0270)W198 (T0270)F203 (T0270)K211 (T0270)D206 5
SheetConstraint (T0270)K211 (T0270)D206 (T0270)Y237 (T0270)F232 5

These sets of constraints were based on looking at the Robetta models and the unfinished results of try2 and try3. Here goes; hopefully we'll get some better-looking structure to form out of this. At that point I would like to do a VAST search on both try4 and try5 if they look better.

Wed Aug 18 0:55 2004 Bret Barnes
Try4 and try5 don't look too good so far (still not done), but I noticed that the RR constraints haven't been included. I'll work more on this tomorrow.

Wed Aug 18 12:19:00 PDT 2004 Kevin Karplus
Bret, please remember to use the T0270.do6 target. I had to go back and remake all the missing Rosetta repackings.

At 249 long, this target is quite likely to be 2 domains. It might be worth looking at the domain predictors on the CASP6 servers site to see where there might be a domain break. Based on just the conservation patterns (a weak signal at best), there may be a break before P134.
You are also using old cost functions, with the sidechain weight set much too high. You might want to pick up the latest guess at a reasonable cost function by copying ../starter-directory/try1.costfcn to try6.costfcn, and then replacing XXX0000 by T0270 and editing down the set of constraints (including all the different constraint files does not make much sense after you've looked at the first model).

I'll make a try6.costfcn, using the t2k constraints, which seem stronger than the t04 ones for this target.

Note: the Makefile has been set up with the belief that the t04 predictions are better, but I don't know why I (or someone else) believed that in this case---perhaps because the t04 alignments have more sequences in them, so may have more signal from more distant homologs.

Wed Aug 18 14:33:52 PDT 2004 Kevin Karplus
Picked up new Template.atoms on try6. Don't forget to comment it out on the next try!

Wed Aug 18 14:37:28 PDT 2004 Martina Koeva
I haven't looked much at this target, but as a start I will collect information from the domain predictors and put it in the README.

Baker-Robetta-GINZU: no domain break
Baker-RosettaDOM: no domain break
baldi-group-server: no domain break
cubic-chopper: break at 34-35
DomPred: no domain break
DomSSEA: no domain break
Dopro: no domain break
SSEP-Align: break at 120-124; or no domain break
Sternberg Phyre: a break somewhere between 29-41

Wed Aug 18 17:32:33 PDT 2004 Kevin Karplus
try6-opt2 looks just as bad as all the other tries. I think I'll leave this one for Martina and Bret---if they get anything it will be better than the trash we have so far.

Thu Aug 19 14:41:47 PDT 2004 Martina Koeva
VAST ID: VS60924
Password: T0270try2

As expected, VAST did not find anything - there were three hits that did not look very plausible. On the other hand, I went back and looked at our template hits and decided to include 1e8gA, 1vaoA and 1qltA in TOP_MANUAL_HITS, for which I ran make extra_alignments and updated the all-align.* files.
All of those hits belong to the same fold as 1w1kA. I will have to look at them more closely and see whether we can try to model T0270 after the templates in d.58.* (32 and 33), which come up on a number of occasions as hits.

Random note: T0270 has 6 histidines (H4, H10, H16, H19, H70, H172), but they are not conserved, except for one (H172). There are also some well-conserved charged residues. Metal-binding site?

Thu Aug 19 18:03:46 PDT 2004 Martina Koeva
I have set up an extract_robetta_sheets.under script and have extracted the sheets from the Robetta models. I have found Robetta model 10 the most useful; it includes 2 sheet constraints close to the constraints that Bret has mentioned above. I have included the constraints from robetta10.sheets in try7.costfcn.

Fri Aug 20 03:24:28 PDT 2004 Martina Koeva
Boo! Try7 forms a nice sheet, but from segments that we have helix predictions for, and those are not even the strands that I was looking for. It looks pretty terrible. I might raise the weight in the next try.

Fri Aug 20 15:21:52 PDT 2004 Martina Koeva
Here is a comment on T0270, posted on the FORCASP site:

Alexey : "Re: CASP6 Target t0270 discussion" | Date: August-18th/04 | Score (1)

Yet another target with the solution to Fold Recognition problem found on the Web. The image of the structure of homologous protein RBSTP1918 from B. stearothermophilus (54% sequence identity, the Midwest Center for Structural Genomics target APC35880) is available from the MCSG 3-D structures gallery (http://olenka.med.virginia.edu/mcsg/images/structures/1T0Tx500r.jpg). There is a chance of a timely release of the PDB coordinates 1T0T, deposited 12-Apr-2004, enabling conventional Comparative Modeling, like T0245 and T0261.

Here are a few insights from its preliminary classification to facilitate the picture-guided modeling, if necessary. The 1T0T subunit contains two structural repeats of a ferredoxin-like fold.
The overall pentameric structure of 1T0T is remarkably similar to the decameric structure of muconolactone isomerase 1MLI with each repeat corresponding to one 1MLI subunit. Hypothetical protein HI0828 is a probable distant homologue of 1MLI with the closest subunit structure, 1MWQ. It has a dimeric structure resembling one 1T0T subunit. The target family appears to share with 1MLI and 1MWQ not only the common fold but also the active site location and architecture. Its invariant His residue (in the C-terminal repeat) can be mapped to the equivalent site shared by the invariant His residue in the 1MWQ (YciI) family and the invariant Glu residue of the 1MLI family.

Fri Aug 20 15:36:13 PDT 2004 Kevin Karplus
1mli is in scop family d.58.4.1. Our 3rd hit with sam-t02 is in the same superfamily (1qjhA), as is our third sam-t04 hit (1n0uA). We could try making models from those alignments (as in show-align.under), then picking up sheet constraints (as in superimpose-best.under), and seeing if they get us anywhere. We could also use only templates from those fold families in the TryAllAlign (include the read-alignments-scwrl.under files for those directories, rather than using the full all-align.a2m). That would at least concentrate our search in a reasonable part of the space.

I don't think we should put too much effort into this target though, since we are not interested in picture-guided prediction.

Sun Aug 22 18:16:59 PDT 2004 Martina Koeva
I looked at the two hits in the same superfamily that we have. For 1qjhA, none of the alignments that I checked in the directory had been made (the files existed but were empty). For 1n0uA, we get very short segments of aligned residues. So we are very unlikely to be able to pick up sheet constraints from them.

I tried including 1mli and 1mwqA in the MANUAL_TOP_HITS and making extra alignments for them.
I have created show-align-1mli.under and show-align-1mwqA.under scripts, and they have generated two pdb files: T0270.1mli.undertaker-align.pdb and T0270.1mwqA.undertaker-align.pdb. I have also made an extract-1mwqA-sheets.under script (a misleading name, since I also extract the helices). I have picked up the sheets for all 10 alignments that I included in the show-align-1mwqA.under script.

I will create try8 and try9 cost functions, so that try8 will include the sheet constraints picked up from align4-1mwqA.sheets and try9 will include the sheet constraints picked up from align7-1mwqA.sheets.

Sun Aug 22 20:49:03 PDT 2004 Martina Koeva
Both try8-opt1 and try9-opt1 are already done and the models look quite interesting. Both models have started forming larger sheets, but they seem to be using the segments that we predict as helices to form some of the sheets.

Mon Aug 23 18:16 2004 Bret Barnes
Martina showed me how to set up a run (try11) based on the sheet constraints produced by the alignments she made (align5-1mwqA.sheets). I also made try11 read only from the 1mwqA and 1mli alignments. try11 is running on meow.

I'm spending some time now setting up try10. Still thinking about constraints...

Tue Aug 24 13:49:35 PDT 2004 Martina Koeva
I had put in some notes on try11 that apparently got overwritten by Bret. Looking at the try11-opt2 results, I am inclined to go back to using all the alignments that we have, instead of just the 1mli and 1mwqA ones that I used in the setup of the try11.under script. Since there is still enough time to build more models on this target, I am thinking of using some more of the sheet constraints in the align*-1mwqA.sheets files to see whether we get any reasonable models. In particular, I am interested in:

align1-1mwqA - has two helices in the alignment and a sheet, which despite the breaks in the alignment (?) looks pretty reasonable. The alignment looks good from a burial perspective too, based on how the helices pack against the sheet.
align2-1mwqA - has only one of the larger helices aligned
align3-1mwqA - too messy; hopefully I will not have to rely on this alignment. There is a whole segment, predicted to be a strand (str2), that lines up in the sheet; don't like it

For now, I will start try12, using the align1-1mwqA sheets and helices, and will start from all of the alignments, instead of only the 1mli and 1mwqA ones.

Wed Aug 25 10:58 2004 Bret Barnes
So, try12 seems to match the secondary structure predictions (str2) pretty well. Now it might just be a process of refining the sheet constraints on try12. I had previous constraints for try10 (not used yet), but I think I will redo them after looking at what came up in try12.

So, this is what I came up with: the sheet constraints from try12-opt2, along with the sheet constraints used in try12.costfcn that seem to be reasonable. Weights were based on how likely the sheets look in try12.

# Sheet Constraints from try12
SheetConstraint D73 F76 F108 V111 hbond L75 10
SheetConstraint S116 E118 G176 K178 hbond Q117 10
SheetConstraint G140 C143 Q154 R151 hbond Y141 10

# Constraints used in try12.costfcn that are close in try12-opt2 and
# predicted to be strands (str2)
SheetConstraint (T0270)G14 (T0270)D20 (T0270)S204 (T0270)W198 hbond (T0270)H16 10
SheetConstraint (T0270)W15 (T0270)D20 (T0270)Y237 (T0270)F232 hbond (T0270)V17 10
SheetConstraint (T0270)Y62 (T0270)I64 (T0270)D201 (T0270)G199 hbond (T0270)Y62 4
SheetConstraint (T0270)E113 (T0270)L114 (T0270)G199 (T0270)W198 hbond (T0270)E113 4

I'm also going to keep the RR constraints. Running try10 on meow.

Wed Aug 25 17:44 2004 Bret Barnes
So, try10 is starting to orient the strands together and build a sheet. All the sheets are pretty short, though. However, they seem to be collecting in the right spot. Try12 still scores better with try10's cost function, but try10 isn't far behind. I'm going to try to extend the sheet constraints that try10 produced and incorporate adjacent strands into the sheets.
In particular, get 15-20 to unravel into a sheet instead of a helix. I have to help a friend study for a calculus test tonight though, so I might not get around to this until tomorrow, but we still have five days before this target is due.

Thu Aug 26 10:34:53 PDT 2004 Sol Katzman
On the FORCASP site, Alexey says:

The 1T0T coordinates have been released today. Happy comparative modeling of T0270.

Date: Thu, 26 Aug 2004 10:46:01 -0700 (PDT)
From: Solomon Katzman
Subject: Need 1t0tA for T0270

Dear Kevin,
On FORCASP, Alexey says we should be doing comparative modeling of 1t0t for target T0270. It was just released. I did a pdb-get 1t0t and it looks like a homo-5mer, but I guess you need to create the models for it so we can use it.
/Sol.

Date: Thu, 26 Aug 2004 11:26:16 -0700
From: Kevin Karplus
Subject: Re: Need 1t0tA for T0270

I am adding 1t0tV to the template library now. See
/projects/compbio/tmp/para-trickle-make-karplus-14541-26019111/README
for more info on the first phase of the creation.

When that has finished (which will probably take a few hours), I'll do a "make REDO_SEARCHES" in casp6/T0270 so that the searches are redone with the new library.

Actually, it might be worth renaming T0270 to T0270-pre-1t0t and starting T0270 over completely from scratch. That will take an extra hour or two, but would allow us to do a "fully automatic" prediction that includes 1t0tV in the template library.