Mon Jun 19 08:34:14 PDT 2006 T0335 Make started
Mon Jun 19 08:35:26 PDT 2006 Running on whidbey.cse.ucsc.edu

Mon Jun 19 08:39:03 PDT 2006 Kevin Karplus
BLAST doesn't find anything in PDB. The best hit is 1x18E, 32% id over 56 residues (E-value 0.440). At least this is short (85 residues), if it turns out to be a new-fold problem.

Wed Jul 5 10:51:20 PDT 2006 Navya swetha Davuluri
Residues 67, 48, and 45 in the two helices are predicted to be buried, but they are exposed in try1-opt2. Residues 52-55 are predicted to be part of a strand, but undertaker made them part of a helix. I am going to start a new run with increased weight on the dry12 component and add a strand constraint for residues 52-55. I removed maybe_metal and maybe_ssbond, as there are no Cys residues.
try2 running on peep.

Wed Jul 5 12:00:51 PDT 2006 Navya swetha Davuluri
Got some error messages and the job did not finish. Made some changes to the try2.under file and started a new run.
try2 running on camano.

Wed Jul 5 16:54:57 PDT 2006 Grant Thiltgen
I can't read try2, so I don't know what is wrong with it. Looking at the protein and the constraints, though, I'm not sure that those few residues are supposed to be a strand.

Thu Jul 6 12:55:05 PDT 2006 Grant Thiltgen
Actually, residues 53-55 are fairly highly predicted to be a strand. There's also a weakly predicted strand from residues 60-62, which could be part of a two-strand sheet (maybe). It's a very weak prediction, though, and I'm not sure there's anything there. I am just not sure whether there would be a single strand floating around by itself in a protein.
The n_notor prediction has a high prediction for a separation of three at G59 (0.756). There's an okay prediction for a separation of three at residues V21 (0.351) and N60 (0.432). The o_notor prediction has a high prediction for a separation of three at D56 (0.670), which corresponds with the prediction at G59, so it could indicate that there is a separation of three there, and it might be two strands.
The top scoring Robetta model has two strands in an anti-parallel sheet, but not in the same location we're predicting. I'm going to try to make the two-strand sheet in try3.

Thu Jul 6 14:29:55 PDT 2006 Grant Thiltgen
It turns out that Navya ended up doing a polishing run of try1-opt2, which is why her model looks almost identical (backbone-wise) to it.

Thu Jul 6 14:36:57 PDT 2006 Grant Thiltgen
I'm putting in a strand constraint to pair up residues K53-I56 with N60-V62 in anti-parallel formation. I'm also putting in the hydrogen bond constraint for D56-G59. There is also a HIS tag at the end of the molecule. I changed the last helix constraint to not include the last 9 residues of the protein.
Try3 started on orcas.

Thu Jul 6 14:58:26 PDT 2006 Grant Thiltgen
I tried looking at 1x18 to see what it looks like, but I think the PDB file is corrupted, because using 'rasmol `pdb-get 1x18`' gave me nothing, and downloading it directly from the PDB website also didn't work. I'm looking to see if we have any alignments to that now. There isn't a 1x18E alignment.
[Sat Jul 8 19:50:21 PDT 2006 Kevin Karplus
1x18 is there, but it is a CA-only model, so you can only really see it in spacefill. As a CA-only model, it isn't much use to us, as undertaker needs a complete backbone for templates.]

Thu Jul 6 15:58:41 PDT 2006 Grant Thiltgen
Try3 scores worse than try2 on the costfcn, but it is better than try1. The sheet didn't exactly form, but instead makes a weird wrappy thing that I believe it got from the fifth alignment. I think it is definitely something to try to work on for submission, since this may be a new fold. I am going to run try4, which is the same as try3, to see if I get anything different. Unfortunately, the HIS tag pushed up against one of the helices, and may be interfering with good packing.
try4 started on camano.
Fri Jul 7 11:25:19 PDT 2006 Grant Thiltgen
While try4 scores slightly worse than try3, I think I prefer how try4 looks, mostly because the HIS tag is out of the way. I think that causes it to score worse on the unconstrained costfcn, though. Try3-opt2 scores better on try4.costfcn, but the try4 model still looks pretty good. I'm definitely going to use try2 and try4 for polishing, and I'm looking into what else I might be able to do to get some different models. I am also going to polish up some of the models we already have, so we have something for the soft deadline on Monday. I'm going to polish try2, try3, try4, the best Robetta model, and the best SAM model.

Fri Jul 7 12:04:05 PDT 2006 Grant Thiltgen
Try5 is a polishing run of try2
Try6 is a polishing run of try3
Try7 is a polishing run of try4
Try8 is a polishing run of the best SAM server model
Try9 is a polishing run of the ROBETTA_TS1 model (best-scoring of servers with unconstrained costfcn)
try5 and try6 started on orcas.
try7 and try8 started on camano.
try9 started on vashon.
For the costfcn, I raised the weights of dry6.5 and dry8 slightly. I raised phobic_fit to 3. I increased soft_clashes to 40 and breaks to 100.

Fri Jul 7 13:43:08 PDT 2006 Grant Thiltgen
I noticed that I used StrandConstraint instead of SheetConstraint in the try3 and try4 costfcns. I'm going to change try4 to try10, changing the StrandConstraint to a SheetConstraint, to see if it makes a difference. Since I've been having problems with the HIS tag, I am going to run it as both try10 and try11.

Sat Jul 8 19:53:43 PDT 2006 Kevin Karplus
If a HIS tag keeps getting in the way, it is sometimes useful to declare and predict a subdomain that excludes the HIS tag. I'll start a prediction for M1-H77.

Sat Jul 8 19:58:43 PDT 2006 Kevin Karplus
M1-H77 make started on cheep.

Sun Jul 9 08:36:17 PDT 2006 Grant Thiltgen
Try1 on the subdomain looks much better than the other tries.
I'm going to try using the last run of constraints on the protein with the HIS tag to see if I can get anything new and interesting.
Try2 and try3 are started on camano.

Sun Jul 9 08:51:53 PDT 2006 Kevin Karplus
I think that Grant is trying to make the middle helix a bit too long. It is T23-L49 in M1-H77/try1-opt2, and that is as long as it is predicted to be. Having the helix constraint extend to D56 (as in M1-H77/try2.costfcn and try3.costfcn) is *not* consistent with the sheet constraint. I'll do an optimization from alignments of M1-H77, but with the M1-H77 try1-opt2 helices and sheets. (The Hbond Grant wants is already formed there, but not the full sheet, so I'll use his sheet constraint.)

Sun Jul 9 09:04:04 PDT 2006 Kevin Karplus
M1-H77/try4 started on cheep.
When try4 and try2 and try3 finish, I'll do M1-H77/try5 to polish the existing models and get as tightly packed a model as I can. When M1-H77/try5 is done, I'll tack on a HIS tag from one of the full-length models and try optimizing once more with the HIS tag in place. If there are multiple *different* models from M1-H77, we'll have to optimize them separately.

Sun Jul 9 10:08:06 PDT 2006 Kevin Karplus
M1-H77/try3 and try4 are quite similar to try1, with the unfortunate tendency to expose M45 and V52. M1-H77/try2 is distinctly different, breaking the long helix, but doing a slightly better job of burying the hydrophobics. I've started try5 on cheep, to try to polish up the M1-H77/try1, try3, and try4 models. (The try2 models are in the input also, but don't score competitively.)

Sun Jul 9 10:16:38 PDT 2006 Kevin Karplus
M1-H77/try1-opt2 is based on 2au5A
M1-H77/try2-opt2 is based on 1f47B
M1-H77/try3-opt2 is based on 1y96B
M1-H77/try4-opt2 is based on 1f47B
Interestingly, the most different one (try2-opt2) seems to be starting from an alignment to the same template as try4-opt2.
M1-H77/try5 seems to be polishing just try1-opt2, probably because it started with the fewest clashes.
Perhaps I should do another polishing run, starting with try3 and try4 gromacs models.

Sun Jul 9 10:26:00 PDT 2006 Kevin Karplus
M1-H77 try6 started on shaw.

Sun Jul 9 10:40:53 PDT 2006 Kevin Karplus
M1-H77/try5 polished up try1-opt2 and try6 is polishing up try3-opt2.gromacs0.
It may be worthwhile to polish up just the model that Rosetta likes best. I'll wait to see whether that is T0335.try4-opt2.gromacs0.repack-nonPC.pdb, or whether try6 beats it.

Sun Jul 9 10:50:40 PDT 2006 Kevin Karplus
M1-H77/try6-opt2 is the new best scoring, but rosetta still prefers T0335.try4-opt2.gromacs0.repack-nonPC.pdb, so I'll do an optimization starting from just that model (M1-H77/try7 started on cheep).

Sun Jul 9 11:28:48 PDT 2006 Kevin Karplus
M1-H77/try7-opt2 (from try4-opt2) scores best now and rosetta likes best
    decoys/T0335.try7-opt2.gromacs0.repack-nonPC.pdb.gz

Sun Jul 9 11:48:57 PDT 2006 Kevin Karplus
I extended the M1-H77 try2, try6, and try7 by tacking on the HIS tag from the whole-chain try4-opt2. There are some bad clashes, so each of the extended-M1-H77-try* models will need to be separately optimized. Started these optimizations as try12 on cheep, try13 on shaw, and try14 on lopez.
Sun Jul 9 11:58:37 PDT 2006 Kevin Karplus
The basic models coming in from M1-H77 are *very* similar to the already existing try6-opt2, which scores best with try12.costfcn.

Sun Jul 9 12:31:07 PDT 2006 Kevin Karplus
After try12,13,14 completed, best scoring with try12=try13=try14 costfcn:
    try12-opt2
    try13-opt2
    try6-opt2
    try7-opt2
    try4-opt2
    try3-opt2
    try11-opt2
Rosetta likes best:
    decoys/T0335.try8-opt2.gromacs0.repack-nonPC.pdb
    decoys/T0335.try5-opt2.gromacs0.repack-nonPC.pdb
    decoys/T0335.try2-opt2.gromacs0.repack-nonPC.pdb
    decoys/T0335.try11-opt2.gromacs0.repack-nonPC.pdb
    decoys/T0335.try8-opt2.repack-nonPC.pdb
    decoys/T0335.try12-opt2.gromacs0.repack-nonPC.pdb
    decoys/T0335.try9-opt2.gromacs0.repack-nonPC.pdb
unconstrained likes best:
    try5-opt2
    try6-opt2
    try2-opt2
    try12-opt2
    try8-opt2
    try13-opt2
    try7-opt2
Putting these together, I'll look at
    try12-opt2
    try13-opt2
    try6-opt2
    try7-opt2
    try8-opt2
    try5-opt2
    try4-opt2
    try2-opt2
    try3-opt2
and try to pick out 5 good, somewhat different, models. I can drop try4 as dominated by try6, and try2 dominated by try5, and try3 dominated by try6, but that leaves me with 6 somewhat different models. Of these, I like try5 least, so I'll drop it.

Sun Jul 9 12:49:03 PDT 2006 Kevin Karplus
I'll do a preliminary submission today of
    try12-opt2
    try13-opt2
    try6-opt2
    try7-opt2
    try8-opt2
But I'm not really happy with these. We need to see if we can improve the burial and pack the proteins tighter. It might be worth seeing if the templates we used (2au5A, 1f47B, 1y96B) are dimers---nope: none are fold recognition---just long helix or a pair of helices matching.

Make started Mon Jul 10 12:08:32 PDT 2006 Running on vashon.cse.ucsc.edu

Fri Jul 21 19:11:56 PDT 2006 Navya swetha Davuluri
I am going to pack the two helices together by adding a distance constraint between Met45 and Leu67:
    Constraint M45.CB L67.CB -10 7.0 14.0 1
I also increased the weight on dry6.5, dry8, and phobic_fit.
try15 running on vashon.
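
A quick way to see whether a finished model satisfies the Met45-Leu67 constraint from the Jul 21 entry is to measure the CB-CB distance directly in the PDB file. The Python sketch below is not part of undertaker. The function names are hypothetical, and reading the 7.0 in the constraint line as a desired distance in Angstroms is an assumption about the constraint fields, not something stated in this log.

```python
import math

def atom_coords(pdb_lines, resnum, atom_name):
    """Return (x, y, z) of the first ATOM record matching residue number
    and atom name, using fixed PDB columns, or None if there is none."""
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        if line[12:16].strip() != atom_name:
            continue
        if int(line[22:26]) != resnum:
            continue
        return (float(line[30:38]), float(line[38:46]), float(line[46:54]))
    return None

def cb_distance(pdb_lines, res_a, res_b, atom_name="CB"):
    """Distance between the named atoms of two residues, or None if
    either atom is missing (for example, glycine has no CB)."""
    a = atom_coords(pdb_lines, res_a, atom_name)
    b = atom_coords(pdb_lines, res_b, atom_name)
    if a is None or b is None:
        return None
    return math.dist(a, b)
```

Usage would look like cb_distance(open(path).read().splitlines(), 45, 67), where path names one of the model files; a value near 7 would suggest the two helices have been pulled together, under the assumption above.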
Mon Jul 24 09:31:38 PDT 2006 Grant Thiltgen
Navya forgot to change try14 to try15 in the try15 costfcn. Luckily, try14 was not one of our best models, so I don't think she overwrote anything useful.

Mon Jul 24 10:06:50 PDT 2006 Grant Thiltgen
I was looking at the t2k vs. the t06 alignments, and I think we might be getting some better signal from the t2k alignments. The t06 sequence logo shows more conservation in the first 50 residues, but the t2k alignment shows better conservation in residues 51-70, so I may try to build a new model based solely on the t2k information. First, I am going to re-run the make in this directory to get a few more alignments. I am also re-running make on the subdirectory. Both of these started on lopez.

Make started Mon Jul 24 10:15:36 PDT 2006 Running on lopez.cse.ucsc.edu

Mon Jul 24 10:41:09 PDT 2006 Grant Thiltgen
I set up try16. It is using the predictions from the str2 alphabet. It actually has a smaller helix toward the end. I may use it in both the subdomain and the whole protein, but I'll try the whole protein first. I think that the sheet might have been off by a residue: the sep alphabet seems to show a separation of 8 instead of 9 in the sheet, so things may be a bit off. There is also a chance of a sep-of-5 capping motif on the end of the C-terminal helix, which I added into the costfcn. Try16 uses only the alignments from the top five t2k scores, and I will make try17, which uses more of them, but only from the t2k alignments.

Mon Jul 24 14:10:06 PDT 2006 Grant Thiltgen
The two new makes are finished. I'm starting try16 now on whidbey. I also started try17, which has the same costfcn as try16, but only uses the t2k alignments for everything.
Try17 started on orcas.

Mon Jul 24 15:13:31 PDT 2006 Grant Thiltgen
Try16 and try17 are done. I'm not sure they look any better, but the sheet formed better. I think I am going to try to do the same thing, but in the subdomain.
I am also going to remove the proline from the first residue of the helix to see if I can get the helix to pack better.
Try18 started on whidbey.
Try8 in the subdomain uses the same costfcn as try18, but it doesn't have the HIS tag to get in the way.
Try8 started on orcas.

Tue Jul 25 12:59:26 PDT 2006 Grant Thiltgen
Try8 on the subdomain looks really good. I'm going to try to make a chimera of it and add on the HIS tag. I took try8 of the subdomain and added the HIS-tag region of try9 for the whole protein to it. I'm going to attempt to polish this up a bit and get rid of the break from the HIS-tag region. This is try19. I kept the constraints from the previous run to try to keep the protein the same except for the HIS tag. I increased the constraints on the model to keep things in place as well.
Try19 started on whidbey.

Tue Jul 25 14:28:29 PDT 2006 Grant Thiltgen
I really like try19. It's a bit foamy in areas, but it packs fairly nicely in others, and the secondary structure predictions work well. I'm going to work on polishing it and increasing the weights of things to pack it nicer. Try20 is a polishing run of try19.
Try20 started on whidbey.

Tue Jul 25 15:35:24 PDT 2006 Grant Thiltgen
Try20 looks even better. I'm going to increase the dry weights and breaks again on try21 to attempt to make it even tighter and better!
Try21 started on whidbey.

Tue Jul 25 15:45:44 PDT 2006 Grant Thiltgen
GAAAH! The stupid XXXXA.fasta file was write protected, so the last few of my runs haven't been making the rosetta repack models. Let me see if I can get those to remake... Although I don't want to overwrite anything I already have.

Tue Jul 25 16:20:54 PDT 2006 Grant Thiltgen
Try21 scores the best so far with the unconstrained costfcn. I am going to polish both try21 and the gromacs0.nonPC model in order to get this model good. I am also going to look into packing some of the other models we've already submitted to see if I can get them to look better.
Both try22 and try23 started on whidbey.

Wed Jul 26 05:54:02 PDT 2006 Kevin Karplus
I think that the current first model (try12-opt2) looks good except that it relies too much on the HIS tag being there. The HIS tag is rarely an integral part of a protein structure, since it is an experimental artifact. Look at the structures without the final LEHHHHHH and make sure that they make sense---a lot of times the HIS tag will be disordered anyway. In rasmol, 'restrict 1-77' will remove the HIS tag, or you can 'select 78-999' and display it with strands to make it transparent.
try6-opt2 looks better to me than try12-opt2, once you remove the HIS tag. For that matter, try9-opt2 (from robetta) looks more convincing than many of ours.
To get more diversity, it might also be worth trying to take the 5th model (try8-opt2) and proteinshopping it into a 4-helix bundle (maybe making a coil from E36-G40 for flexibility). The two pairs of helices there could be packed on either side. It looks to me like packing I10 and L13 against L67 might work. I don't have ProteinShop here in the hotel, so I'll have to leave this for Grant and Navya to work on.

Wed Jul 26 11:41:20 PDT 2006 Grant Thiltgen
Try21 looks good, and I think I am going to put that as the new model one. I am going to work on proteinshopping try8-opt2 into a four-helix bundle.

Wed Jul 26 11:54:21 PDT 2006 Grant Thiltgen
I used ProteinShop to make a four-helix bundle out of try8-opt2. I did it by making E36-G40 into a coil and flipping the third helix around so that L67 was near I10 and L13. I am going to run it through undertaker with Helix constraints to keep the four helices intact and with some rr constraints to keep the residues near each other. I also used ProteinShop to extend the HIS tag out in order to remove that region from the four-helix bundle.

Wed Jul 26 12:08:28 PDT 2006 Grant Thiltgen
Try24 is ready.
I took the proteinshopped model and kept the helix constraints and a few residue constraints to keep the residues together. We'll see how it works.
Try24 started on shaw.

Wed Jul 26 12:21:43 PDT 2006 Grant Thiltgen
Try25 is working on packing try9, the model based on the Robetta model. I included constraints to keep the helices and the sheets that we have, and increased the weights to work on packing things better.
Try25 started on shaw.

Wed Jul 26 13:26:52 PDT 2006 Grant Thiltgen
Try24 and try25 are finished. Try24 looks pretty good, except the two halves of the bundle are a bit far apart. I am going to try to add some constraints to pull the helices in toward each other. Try25 looks pretty good too. I'm also going to try to reduce the foaminess of that model.
Actually, I think I may just try twisting the helix of try24 that looks like it's a quarter turn off in ProteinShop to see if that works, then run it through undertaker again.

Wed Jul 26 14:01:32 PDT 2006 Grant Thiltgen
Try26 is ready. I rotated the helix in ProteinShop and moved the helices closer together. I am going to run it through undertaker and try to optimize it.
Try26 started on camano.
Try27 is another polish of the robetta model.
Try27 started on whidbey.

Wed Jul 26 14:12:57 PDT 2006 Grant Thiltgen
I am also trying to pack try6 better; that run is try28.
Try28 started on shaw.

Wed Jul 26 14:52:22 PDT 2006 Grant Thiltgen
Try29 is going to be a run trying to pack the best rosetta-scoring model, try8-opt2.gromacs0.repack-nonPC.pdb, tighter.
Try29 started on whidbey.
Gaah. I guess I should redo try26, since it came out pretty bad. I will probably need to add some distance constraints to keep the helices together to avoid the clashes. Try30 is my redo of try26 with some distance constraints to try to keep the helices from flying apart. Maybe I just moved them too close with ProteinShop.
Try30 started on shaw.
Maybe I'll try optimizing try5 as well.
It scores very well with the unconstrained costfcn and doesn't appear to be too far off. Try31 is attempting to pack try5 better.
Try31 started on shaw.

Wed Jul 26 15:15:48 PDT 2006 Grant Thiltgen
Try28-opt1 scores better than try28-opt2, though only by less than a point.

Wed Jul 26 15:49:32 PDT 2006 Grant Thiltgen
Try29 is done. It is trying to pack the best scoring rosetta model better. I am going to run one more try of it and probably call it finished for that model. Try32 is that run.
Try32 started on shaw.

Wed Jul 26 15:57:31 PDT 2006 Grant Thiltgen
Try30 is done. It does better than try26 in the unconstrained costfcn. I am going to work on packing it better too. This is try33.
Try33 started on whidbey.
I think I am going to leave try6 instead of trying to use the polished model try28, since try28 scores worse in the unconstrained costfcn, and opt1 of try28 scores better than opt2. I am also going to use try25 for the robetta models that we polished. Since try27 started scoring worse, and try27-opt1 does better than opt2, I am going to stick with try25.

Wed Jul 26 16:57:10 PDT 2006 Grant Thiltgen
Try33 scores better than try30, but the bundle has a beta-bridge. It could be possible since it is around the sheet prediction. Now I just have to figure out which models to submit. I think these are the models I've chosen:
    T0335.try21-opt2.pdb
    T0335.try6-opt2.pdb
    T0335.try25-opt2.pdb
    T0335.try33-opt2.pdb
    T0335.try29-opt2.gromacs0.repack-nonPC.pdb

T0335.try21-opt2.pdb is our best scoring model by the unconstrained costfcn. I made this model by using the secondary structure predictions from the t2k alignments, because I felt they were more specific to this protein given the number of conserved residues. I used the t2k alignments for the top scoring models to make the model as well. I used the str2, notor, and separation alphabets in order to determine the hydrogen bonding patterns for the sheet and the motif for the final helix.
I then attempted to make the model less foamy in order to pack the helices tighter. I also made it from a subdomain model removing the HIS tag, then adding in a HIS tag from another model (try9-opt2, because this model didn't include the HIS tag in the final helix).

T0335.try6-opt2.pdb also scores well in the unconstrained costfcn. I attempted to make a model from the t06 information. I attempted to add a small two-strand sheet at the bottom of the helix. This two-strand sheet didn't really form well, because the predictions for the model didn't allow enough space, but there was enough room for a beta bridge and a hairpin that was predicted from the notor alphabet.

T0335.try25-opt2.pdb is a polished model of the best Robetta server model. I liked the way that the strands were formed in the predicted region of the protein. However, the hairpin didn't match our predictions that well. It also doesn't score as well on our costfcn as some of the other models with the hairpin in place.

T0335.try33-opt2.pdb is an attempt to make a four-helix bundle out of this protein. One of our models (try8-opt2) had a three-helix protein with a kinked helix, so I used ProteinShop to remove the helix from residues E36-G40. I then rotated the helices in order to pack residues I10 and L13 against L67. I sent it to undertaker, then when that was finished, I rotated the third helix in the bundle in order to bury some predicted buried residues, and I moved the helices closer together. I added some distance constraints to keep the helices as a bundle. The beta bridge was added in an attempt to pack the helices tighter together. The original try8-opt2 model that this came from was a polish of the best scoring SAM model.

T0335.try29-opt2.gromacs0.repack-nonPC.pdb is the model we created that scores the best with the rosetta scoring. It was also originally created from try8-opt2, which was a polish of the best scoring SAM model, but I left the three helices together the way SAM created them.
I used the gromacs0.repack-nonPC packing from try8-opt2 to optimize according to the rosetta scoring function. Try29-opt2 was the best model from these optimizations.

----

Try31-opt2 also scores very well on the unconstrained costfcn, but I chose not to include it because of the location of the HIS tag. It ended up creating a three-helix bundle, but I'm not sure how much of a bundle it would be if the extra few residues at the end were not located there. However, I would not complain if we replaced one of the current models with this model.

------------------------------------------------------------

Wed Jul 26 19:23:55 PDT 2006 Kevin Karplus
I submitted Grant's top 5 models with his comments.
    ReadConformPDB T0335.try21-opt2.pdb
    ReadConformPDB T0335.try6-opt2.pdb
    ReadConformPDB T0335.try25-opt2.pdb
    ReadConformPDB T0335.try33-opt2.pdb
    ReadConformPDB T0335.try29-opt2.gromacs0.repack-nonPC.pdb

Sat Sep 9 18:43:59 PDT 2006 Kevin Karplus
Our best model was try12-opt2.gromacs.repack-nonPC, which we did not submit. The best we submitted was model2 (try6-opt2), which had slightly better Hbonds but worse GDT and RMSD. The Zhang-Server_TS1 had the best prediction of any of the servers---beating anything we submitted, but not as good as try12-opt2.gromacs.repack-nonPC. I may have been wrong to suggest looking at the model without the HIS tag, as try12 *was* our best.
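
For comparing models with and without the C-terminal tag outside rasmol (the 'restrict 1-77' view suggested in the Jul 26 entry), the tag residues can also be dropped from a PDB file before scoring or superposition. This is a minimal Python sketch with a hypothetical function name; it is not a tool used in this log, and it only filters ATOM and HETATM records by residue number, leaving every other record (including any TER line) untouched.

```python
def drop_residues_after(pdb_lines, last_residue=77):
    """Return the PDB lines with ATOM/HETATM records removed when their
    residue number (columns 23-26) is greater than last_residue.
    The default of 77 matches the M1-H77 subdomain; other record types
    are passed through unchanged."""
    kept = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")) and int(line[22:26]) > last_residue:
            continue
        kept.append(line)
    return kept
```

Writing the returned lines to a new file gives a tag-free copy that can be viewed or scored alongside the full-length model.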