Tue Jul 13 09:00:40 PDT 2004 T0239 DUE 20 Aug

Tue Jul 13 11:50:47 PDT 2004 Kevin Karplus
Sequence only finds itself in NRP---real ORFan!
Neither t2k nor t04 finds anything.
Naturally there is not much consensus then on what the fold is.

Tue Jul 13 12:40:45 PDT 2004 Kevin Karplus
There are a number of weak hits to a.4.5.21 (1awcA), a DNA-binding protein that sticks a helix into the major groove, but these may all be close variants of the same sequence. That domain has only one 4-strand anti-parallel sheet.

There seems to be consensus that there are 6 strands:
    s1: H4-T9
    s2: C24-Y28
    s3: V32-K35
    s4: K41-S45
    s5: C83-I87
    s6: V91-R96
and 2 helices:
    h1: K13-L21
    h2: L58-N66

We have s2 ^v s3 ^v s4 and s5 ^v s6, which leaves the two connections made by the helices. Some possibilities include

with both helices on the same face
    s1 || s2 ^v s3 ^v s4 ^v s6 ^v s5
    s1 || s4 ^v s3 ^v s2 || s5 ^v s6
    s1 || s4 ^v s3 ^v s2 ^v s6 ^v s5
I like the last one, because it makes good use of the long helix.

with the helices on opposite faces
    s4 ^v s3 ^v s2 ^v s6 ^v s5 || s1

It is possible that there are disulfide bonds (one template aligns to a C34-C80 disulfide).

Tue Jul 13 15:45:36 PDT 2004 Kevin Karplus
I was hoping that the template library might find the sequence, since they have homologs (mostly), even though the target alignment is single-sequence, but no such luck.

try1 is finished. The model looks unlikely (doesn't agree with the secondary structure prediction and not too well with the burial either).

There are 6 CYS residues, and I suspect that this is held together by disulfides. There are 15 possible pairings for the 6 CYS. Maybe we should just run 15 different cost functions with the different SSbond pairs, and see what comes out looking most reasonable.

Tue Jul 27 17:17:15 PDT 2004 Martina Koeva
Marcia and I did the following work on this target:
1. Submitted T0239-try1.opt2.pdb to VAST.
   Search ID: VS60158
   Password: T0239try1
   There were 9 hits, most of which look like trash: only very short segments of aligned residues and P-values on the order of 0.05 and higher. There is one good hit, and the alignment could be used by undertaker.
   For Marcia to do: incorporate the alignment to the top structural hit.
2. Edited try1 into try2.costfcn, adding the first combination of pairs of Cys residues for the formation of disulfide bonds:
       SSBond C22 C83 1.0
       SSBond C24 C80 1.0
       SSBond C34 C54 1.0
   Kept the weights on SSBond constraints at their default values, but increased:
   - constraints weight => from 10 to 20
   - hbond_geom_beta => from 4 to 6 (also rescaled other hbond weights)
   For Marcia to do: prepare all other possible combinations of cys residue pairs in appropriate .costfcn files.
3. For Marcia to do: Define a strand rasmol script.

Wed Jul 28 14:59:54 PDT 2004 Marcia Soriano
I created two rasmol scripts that define all six strands and the two helices. They can be identified as s1 through s6 and h1 through h2 respectively. Martina mentioned that I should create a soft-link to them, but I did not do so. The files can be found in the T0239 directory as strands and helices.

Martina showed me how to add VAST alignments and use them for an undertaker run. We only used the top hit from VAST because the P-values of the other hits were trash (0.05 and higher).

Finally, I prepared try2-try11.under and .costfcn files for the possible combinations of cys residue pairs, except those containing C22-C24 or C80-C83 pairs, which I felt might not be plausible because the residues are so close to each other.

Wed Aug 4 13:38:55 PDT 2004 Marcia Soriano
I have attempted to run the trys on the condor cluster but have been unsuccessful. I created a decoy foo.under file before running the script on the trys, in order to make sure that there would be no problems. For some reason, something will not permit me to write a .log file.
Running the command when I ssh directly to cc01 works, though. I have contacted Jorge Garcia about the problem and have yet to receive a response.

Mon Aug 9 19:57:53 PDT 2004 Marcia Soriano
I completed all try12-try16.under and .costfcn files for the remaining possible combinations of the pairing of cys residues. Kevin told me that it is rare, but still possible, for two cys residues to form a disulfide bond even when they are almost next to each other.

Mon Aug 16 14:53:04 PDT 2004 Marcia Soriano
I have just completed try17.costfcn and try17.under and will start running it. I commented out all 'include T0239.t2k.---.constraints' lines and left the t04.constraints found towards the bottom of the .costfcn file. I added known_ssbond with a weight of 1. I also changed the weights of pred_alpha2k to 2 and pred_alpha04 to 4.

I am unsure about a couple of things: whether I should have left pred_alpha2k and pred_alpha04 alone or changed their weights, and whether I should remove the constraints. If I run into problems, I will ask.

From karplus@soe.ucsc.edu Tue Aug 17 12:10:44 2004
Date: Tue, 17 Aug 2004 12:10:42 -0700
From: Kevin Karplus
To: marcias@soe.ucsc.edu, marcias@ucsc.edu
CC: karplus@soe.ucsc.edu
Subject: T0239

At yesterday's meeting, we identified several of the possible disulfide pairings as being worth further attention. Notes on our discussion should be in the T0239/README file, but don't seem to be. At the very least you should list the ones we will go forward with, and identify which of your subsequent tries corresponds to each of the pairings.
Kevin

Tue Aug 17 12:49:44 PDT 2004 Marcia Soriano
At yesterday's meeting, we took a look at several trys and decided to go forward with the following:
try13
    SSBond C22 C54 1.0
    SSBond C24 C34 1.0
    SSBond C80 C83 1.0
try3
    SSBond C22 C83 1.0
    SSBond C24 C54 1.0
    SSBond C34 C80 1.0
try5
    SSBond C22 C80 1.0
    SSBond C24 C83 1.0
    SSBond C34 C54 1.0

Tue Aug 17 13:26:42 PDT 2004 Kevin Karplus
Marcia and I are looking at try17-opt2, which was optimized with a cost function similar to try13, but starting again from alignments. The results were not as good as try13, as the sheet that formed there did not form in try17. Perhaps we should add a sheet constraint pairing E87 and D23 and C83 and Y27.

I'll set up try18 for Marcia as a polishing run (starting from existing models, rather than from alignments).

Tue Aug 17 15:03:06 PDT 2004 Marcia Soriano
I am currently running try18, try19, and try20. Try19 and try20 are polishing runs for try3 and try5, respectively. For both, I used try18.under and .costfcn as a template. The cys pairing for try19 was based on try3 and the pairing for try20 was based on try5.

For the try19 and try20 .costfcn files, I commented out the constraints, left the corresponding cysteine pairings, and kept known_ssbond with a weight of 1. In the .under files, I commented out the blocks of 'TryAllAlign' lines and uncommented the 2 lines below them, starting at the line that begins with 'InfilePrefix'.

Tue Aug 17 18:10 2004 Bret Barnes
I went ahead and submitted another VAST job, since try19-opt2 seems to be looking like our best model. Perhaps it will give us one last alignment to add.
    Job ID: VS60862
    PASSWORD: T0239B

Tue Aug 17 21:14:20 PDT 2004 Kevin Karplus
It looks like the SSBonds do not have enough weight---they are not forming in try18, try19, and try20. I increased the weight of maybe_ssbond in unconstrained.costfcn, so that the unconstrained scores give a better idea of which models are working.
The new unconstrained cost function likes try13 and try17 best. Try18 does poorly, and completely fails to form a sheet. try19 forms a nice sheet, but lacks the ssbonds.

Ah---I see the problem. The SetCost function is terminated by the commented-out line, so the rest of the cost function is ignored.

Note: it is important to run a "make decoys/score-all.try19.pretty" before doing a long run, since errors caught early are easier to fix.

What try19 appears to have gotten is C80-C83, C24-C34, C22-C54, which is the try13 cost function again. This pairing seems to be the most favored one, and deserves the most attention.

I'll do a polishing run with those ssbonds and a properly constrained cost function as try21, then set up the other runs that Marcia had intended to run.

Try22 will have the try3 combination:
    SSBond C22 C83 1.0
    SSBond C24 C54 1.0
    SSBond C34 C80 1.0
Try23 will have the try5 combination:
    SSBond C22 C80 1.0
    SSBond C24 C83 1.0
    SSBond C34 C54 1.0

I wasn't sure whether Marcia had intended these to be exploratory runs (from alignments, like the first 15 runs), to get a different solution for each of the models, or polishing runs (using existing models). I set them up as polishing runs, but we might also want to do some more exploration.

The VAST run that Bret started is finding only insignificant hits (15 or 16 residues with two gaps to 1quzA and 1qkyA). Nothing useful enough to copy the alignment for.

Tue Aug 17 22:33:59 PDT 2004 Kevin Karplus
With the jobs only half finished, it looks like try21 will be our first model, and try22 our second. Who knows what might happen in the rest of the optimization and with try23?

Marcia, were there any others further down that looked worth trying? Try15? try14?

Wed Aug 18 08:54:49 PDT 2004 Kevin Karplus
I realized that one reason some of the 15 runs did not form all their SSbonds was that the InsertSSBond and ImproveSSbond operators were turned off in the initial runs that Marcia did.
This is not entirely a bad thing---the folds one gets without those operators may be a little more natural, so it might, in fact, be better to choose the pairing when the operators are not turned on. I did turn them on for try21, try22, and try23.

try21-opt2 scores best with the unconstrained cost function, and it looks best to my eye also. It could be packed just a tiny bit tighter, but is not bad. try22 and try23 don't look as good to me, but we might submit them as alternative models.

Rosetta likes try14-opt2.repack-nonPC best, then try21-opt2.repack-nonPC. Maybe I'd better do a polishing run with the try14 constraints also.

Try24 will use the try14 constraints:
    SSBond C22 C24 1.0
    SSBond C34 C83 1.0
    SSBond C54 C80 1.0

I'll also do a polishing run with the unconstrained costfcn, which will probably work mainly to improve try21-opt2.

Wed Aug 18 10:04:03 PDT 2004 Kevin Karplus
I need a chart of ss bonds vs tries

        22 22 22 22 22 24 24 24 24 34 34 34 54 54 80
        24 34 54 80 83 34 54 80 83 54 80 83 80 83 83
try2    X X X
try3    X X X
try4    X X X
try5    X X X
try6    X X X
try7    X X X
try8    X X X   #inconsistent (C54 twice)
try9    X X X
try10   X X X
try11   X X X
try12   X X X
try13   X X X
try14   X X X
try15   X X X
try16   X X X

        22 22 22 22 22 24 24 24 24 34 34 34 54 54 80
        24 34 54 80 83 34 54 80 83 54 80 83 80 83 83
try17   X X X   =try13
try18   X X X   =try13
try19   X X X   =try3 (broken costfcn)
try20   X X X   =try5 (broken costfcn)
try21   X X X   =try13
try22   X X X   =try3
try23   X X X   =try5
try24   X X X   =try14
try25   --------------------------------------------  (maybe_ssbond)

        22 22 22 22 22 24 24 24 24 34 34 34 54 54 80
        24 34 54 80 83 34 54 80 83 54 80 83 80 83 83
try26   X X X   fix try8

try8 should have been 22-54, 34-80, 24-83
Let's try that as try26, from alignments.
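
A bookkeeping slip like the one in try8 (C54 used twice) is easy to catch mechanically. The following is only a sketch, not part of undertaker or the lab scripts: the helper names (pairings, is_consistent, ssbond_lines) are made up for illustration, and the output merely imitates the "SSBond C22 C83 1.0" lines quoted in these notes. It enumerates every way to split the 6 cysteines into 3 disjoint pairs (15 in all) and checks that a proposed set of pairs uses each cysteine exactly once.

```python
# Sketch only: helper names are hypothetical, not undertaker commands.
CYS = (22, 24, 34, 54, 80, 83)  # cysteine positions listed in the chart header


def pairings(residues):
    """Yield every way to split an even-length tuple into disjoint pairs."""
    if not residues:
        yield ()
        return
    first, rest = residues[0], residues[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield ((first, partner),) + tail


def is_consistent(pairs, residues=CYS):
    """True when every residue appears in exactly one pair."""
    used = sorted(r for pair in pairs for r in pair)
    return used == sorted(residues)


def ssbond_lines(pairs, weight=1.0):
    """Format pairs like the SSBond lines quoted in the notes."""
    return [f"SSBond C{a} C{b} {weight}" for a, b in pairs]


ALL_PAIRINGS = list(pairings(CYS))

if __name__ == "__main__":
    for number, pairs in enumerate(ALL_PAIRINGS, start=1):
        print(number, "; ".join(ssbond_lines(pairs)))
```

Running is_consistent on each triple before writing a .costfcn file would flag any set that repeats a cysteine. The order in which the sketch lists the pairings has nothing to do with the try numbers used here.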
Wed Aug 18 10:51:00 PDT 2004 Kevin Karplus
It may also be worth retrying the combinations that did not form all their ssbonds on the first try, with InsertSSBond turned on:
    try7, try2, try9, try6, try4, try10, try16, try8, try12
Hmmm, that's a lot to redo, and there is a reasonable argument that the disulfides should come close in a "natural" folding of the peptide chain, so ones that can't be formed without extreme forcing may not be worth considering.

Wed Aug 18 11:20:57 PDT 2004 Kevin Karplus
It looks like try24-opt2, which scores fairly well, is also quite close to a solution to the try4 constraints, which avoid the C22-C24 connection. I think it is worth setting up a run for the try4 constraints also. That will be try27.

try26-opt2 makes the requested disulfides, but is otherwise pretty unconvincing.

Wed Aug 18 14:43:35 PDT 2004 Kevin Karplus
try27, the improved run for try4, does respectably.

If I were to submit now, I'd submit
    try25-opt2  best overall (in try13, try17, try18, try21 group)
    try22-opt2  (best in try3 group)
    try24-opt2  (best in try14 group)
    try23-opt2  (best in try5 group)
    try1-opt2   full-auto
try27-opt2 is the next one down the list.

I just redefined the unconstrained costfcn, upping the soft_clashes and breaks. Doing that moves try27-opt2 ahead of try23-opt2.

I'll do a polishing run with the unconstrained cost fcn as try28. This will just polish try25 some more.

I wonder if it is worth the time to try polishing try23 and try27 by upping the clashes and breaks on their respective cost functions?

try28 made a small improvement over try25, mainly in clashes. It still has two bad breaks:
    T0239.try28-opt2.pdb.gz breaks before (T0239)H58 with cost 1.10783
    T0239.try28-opt2.pdb.gz breaks before (T0239)I37 with cost 0.744638
The first one looks like a bond-angle problem around the L57.CA.
The P36.C-I37.N bond is a little long, but doesn't look terrible.
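
For a quick sanity check on a "long" bond like P36.C-I37.N, the C-N distance can be computed from the model coordinates and compared with the typical peptide bond length of about 1.33 Angstroms. This is a sketch under stated assumptions, not undertaker's break cost: the function names and the 0.1 Angstrom tolerance are arbitrary choices made for illustration, and the coordinates would have to be read from the ATOM records of the model by hand or by some other script.

```python
import math

# Typical peptide C-N bond length in Angstroms (standard geometry value).
IDEAL_C_N = 1.33


def peptide_bond_length(c_xyz, n_xyz):
    """Distance between residue i's C atom and residue i+1's N atom."""
    return math.dist(c_xyz, n_xyz)


def is_stretched(c_xyz, n_xyz, tolerance=0.1):
    """True when the C-N distance exceeds the ideal by more than tolerance.

    The tolerance is an assumed cutoff for illustration only; it is not
    the threshold undertaker uses when it reports breaks.
    """
    return peptide_bond_length(c_xyz, n_xyz) - IDEAL_C_N > tolerance
```

Calling is_stretched with the C coordinates of P36 and the N coordinates of I37 would give a yes/no answer that does not depend on eyeballing the model in rasmol.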
Wed Aug 18 17:18:06 PDT 2004 Kevin Karplus
For try29, I'll try polishing try23, but with constraints to force the current gaps to close.
For try30, I'll do the same for try27.

If these work well enough to move try23 and try27 ahead of try22 or try24, I should probably redo those the same way.

Thu Aug 19 08:25:39 PDT 2004 Kevin Karplus
try28 is still the best scorer, of course, but try30 has moved ahead of try24 (though not try22), so I probably need to redo try22 and try24. try29 scores much worse.

I don't like the weird crossover from one strand to another in try29, so I don't mind much that it scores badly. Try30 looks ok, but not exceptionally good.

Maybe I should do a gap-closing run for try28 also.
    run     from
    try31   try24
    try32   try22
    try33   try28/try25/try21

Thu Aug 19 09:22:05 PDT 2004 Marcia Soriano
Yesterday I 'googled' T0239 and found a forum for CASP (FORCASP, http://www.forcasp.org). This information given by Alexey may be of use:

    Alexey : "Re: CASP6 Target t0239 discussion" | Date: August-16th/04 | Score (1)
    A small protein of orphan sequence and unknown function, this would be an ideal target for New Fold methods but is spoiled by additional structural info in the title of its PDB deposition 1RKI "Structure of pag5_736 from P. aerophilum with three disulphide bonds". It could make a difference in filtering a correct prediction.

Thu Aug 19 11:58:10 PDT 2004 Kevin Karplus
We had already assumed 3 disulfide bonds, so this won't change our predictions, but it is good to know that we guessed right.

try28 scores slightly better than try33 on the unconstrained cost function, as try33 just moved the gaps rather than really closing them. We could try another polishing run, but I don't think that it'll make enough difference to be worth doing.
Our best models in the top lineages are
    try28-opt2  (try33, try25, try21, try18, try17, try13)
    try32-opt2  (try22, try3)
    try31-opt2  (try24, try14)
    try30-opt2  (try27, try4)
    try23-opt2  (try29, try5)
    try26-opt2  (try8 replacement)
    try15-opt2

The best Rosetta energy is for try33-opt2.repack-nonPC, then try14, try25, try21, ...

I'm not going to submit a Rosetta-repacked model for this target, preferring instead to spread my bet among the different SS bond pairings.

So my main decision left is whether to submit try28 or try33 as my first model. try28 scores better unconstrained and with the try28 costfcn, try33 scores better with rosetta and with the try33 costfcn. The differences between them, when viewed with rasmol, are so small that it is probably irrelevant, so I'll go with try28-opt2.

I'll submit
    try28-opt2  (try33, try25, try21, try18, try17, try13)
    try32-opt2  (try22, try3)
    try31-opt2  (try24, try14)
    try30-opt2  (try27, try4)
    try1-opt2   full auto

Thu Nov 18 23:36:45 PST 2004 Martina Koeva
Based on the smooth gdt scores:
    best sam-t04    18.9207 (try12-opt2)
    best submit     17.8215 (also model1)
    model1          17.8215
    auto            15.7627
    align           15.1614
    robetta best    23.8265 (robetta model10)
    robetta1        21.8487