Tue May 16 08:56:43 PDT 2006 T0289
Make started Tue May 16 08:57:48 PDT 2006
Running on shaw
Make started Tue May 16 09:08:24 PDT 2006
Running on shaw

Tue May 16 09:28:59 PDT 2006 Kevin Karplus
Had to restart, because the initial sequence downloaded from the web site had a formatting error (an extra space after the '>').
This seems to be a comparative model, as the t06 alignment has 12 pdb sequences in it.

Tue May 16 10:22:40 PDT 2006 Kevin Karplus
BLAST at NCBI (http://www.ncbi.nlm.nih.gov/BLAST) reports this as being in pfam04952 (AstE_AspA):
    AstE_AspA Succinylglutamate desuccinylase / Aspartoacylase family. This family includes Succinylglutamate desuccinylase EC:3.1.-.- that catalyses the fifth and last step in arginine catabolism by the arginine succinyltransferase pathway. The family also include aspartoacylase EC:3.5.1.15 which cleaves acylaspartate into a fatty acid and aspartate. Mutations in human Aspartoacylase lead to Canavan disease. This family is probably structurally related to pfam00246 (Bateman A pers. obs.).
but provides no strong matches (best E-value 0.86).

The t04 multiple alignment has 9 pdb sequences, and the t2k one has none, so this may qualify as fold-recognition, rather than comparative modeling---the homologs are a bit distant, so alignment will matter a lot.

Top hits with the t06 and t04 w0.5 target models include
    1h8lA 1uwyA 1yw6A 2g9dA 1yw4A 2bcoA 1cpb ...

The t2k alignment, having no pdb sequences in the multiple alignment, finds fewer, with top hits
    1yw6A   not in template lib
    2g9dA   not in template lib
    2bcoA
    1yw4A
and a huge increase in e-value before
    1h8lA
    1uwyA

Unfortunately 2bcoA and 1yw4A are not in SCOP, so I can't verify that they are the same SCOP classification as the others, but VAST has the most similar structures to 2bcoA being 2bcoB, 1yw6A, 1yw4A, 1jqgA, 1zg7A, 1dtdA, ... Both 1h8lA and 1uwyA have very similar structures (p-values less than 1.e-8).
The 2-track HMMs from t2k favor 2bcoA and 1yw4A highly, with a large increase in E-value before 1h8lA, 1qmuA, 1jqgA, 2bo9A, ...

Make started Thu May 18 15:14:25 PDT 2006
Running on lopez.cse.ucsc.edu
The initial run died in the power failure, so I cleaned up the junk and restarted make.

Thu May 18 15:19:13 PDT 2006 Kevin Karplus
The top hits seem to be 1dtdA and 1m41A, which were not the top hits with the w0.5 models. It might be worth doing a BLAST search of pdb and recording the top few hits.

Thu May 18 18:10:42 PDT 2006 Kevin Karplus
This is the one target for which the blastall crashes. We need to install a more recent version of NCBI-blast.

Make started Thu May 18 23:56:52 PDT 2006
Running on cheep.cse.ucsc.edu
The remake that I started on lopez died, because there was a gzipped empty Template.atoms file. I'll try running the make again.

Fri May 19 07:57:05 PDT 2006 Kevin Karplus
The try1-opt2 model looks pretty good. There is an exposed hydrophobic patch that looks like a dimerization interface at L45-T52. We may want to pick up the full biological unit from the homologs and superimpose on that to optimize as a dimer.
There is some confusion about the secondary structure from E284 on (str2 disagrees with other predictors). It might be worthwhile to do a subdomain for P215 to the end, to see if we can pick up any more signal without the first part.

Fri May 29 11:55:04 PDT 2006 Grant Thiltgen
I checked the SCOP domain and fold for the two top hits for fold recognition (c.56.5.1), and it has a core of mixed sheets, which the try1-opt2 model appears to match well. I looked at some of the matching PDB files for fold recognition and the top two matches for the BLAST hits. The best score for the fold recognition (1dtdA) appears to match extremely well. The second hit (1m41A) appears to be a barrel. The best score for the BLAST search (2g9dA) actually appears to match the structure of try1-opt2 and of the top hit for fold recognition.
I'm a bit concerned with the sheets trying to fold into a barrel at the end, and one or the other may be a construct of the two different fold recognition results. I also checked the PDB file for the best fold recognition hit, and I can't really tell, but I think it might be a multimer of some kind, but I need to check further into it.

Fri May 29 14:35:09 PDT 2006 Grant Thiltgen
Ooops! I need to tell the difference between a "1" and an "l". So the folds of the two fold recognition hits are the same, but I still don't really like the end, which is probably a separate domain.

Sat May 20 08:27:11 PDT 2006 Kevin Karplus
It looks like Grant started doing subdomain predictions yesterday afternoon (16:06) for S1-F214 and P215-H312, but the P215-H312 one seems to have crashed before writing out the models, and Grant did not make the directory writable, so no one else can fix it. He also did not direct the output of the make to a log file, so we can't see, for example, why the rr files are missing.
Oops---yes we can. It looks like the subdomain make is creating Makefiles that point to the pce/starter-directory instead of the casp7/starter-directory, so are getting the slightly obsolete Make.main. I'll have to fix that.

Sat May 20 08:45:42 PDT 2006 Kevin Karplus
OK, I fixed the Makefile and the casp7/scripts/split-into-domains to use the new Make.main for subdomains. We have to be careful, as some of the old Makefiles still point to the old split-into-domains script.
I moved Grant's efforts to Grant-S1-F214 and Grant-P215-H312 and started new S1-F214 and P215-H312. (Grant, you can delete your directories if you want--no one else can as long as they are not group-writable.)

Sat May 20 12:38:33 PDT 2006 Kevin Karplus
For P215-H312, 2bcoA is coming up as the top hit, but with an E-value of 7.5. None of the hits are labeled as containing c.56.5.1 (the N-terminal domain).
For S1-F214, 1dtdA and 1m4lA are the top two hits, followed by 2bcoA, 1yw4A, 2b09A, 1h8lA, ... .
E-values start around 5.7e-18.
The only intersection between the best hits list for the two domains is 2bcoA, which scores well with both. It looks like 2bcoA is the right template for the whole protein, but 1dtdA and 1m4lA are better for the N-terminal domain.

Sat May 20 15:32:56 PDT 2006 Kevin Karplus
The P215-H312 try1 run did *not* select the sandwich from 2bcoA. Instead it seems to have picked a barrel from 1c9oA. It may be worthwhile to make a chimera with this C-terminal domain and a good N-terminal model, though I suspect that the 2bcoA template is a better bet.

Sat May 20 19:01:40 PDT 2006 Kevin Karplus
Looking at the superposition of the best alignments with try1-opt2, it looks like 2bcoA and 1yw4A have the same C-terminal domain. The alignments look better than try1-opt2 in this region. We may be getting some messing up from 1h8lA, which has a somewhat different C-terminal domain in a different place.

Mon May 22 14:51:15 PDT 2006 Grant Thiltgen
I started a run of the whole protein excluding the all-align.a2m file and using the alignments from only 2bcoA to see if we could force the protein to follow that template. [This must be try2---KJK]
I have it automatically set up to make things that I create group readable, but not writeable. Is it best to run fixmode when I'm finished with a run, or is the makefile supposed to do that when it's finished? Since my files didn't finish running, did that part of the makefile not run?
I also started a run for P215-H312 using the alignment from only 2bcoA. I thought I should run it to check if a chimera would be a better try than just the whole protein. [This must be P215-H312/try2---KJK]

Mon May 22 21:47:22 PDT 2006 Kevin Karplus
I picked up all the server results for this target and am scoring the server predictions with the try1 costfcn. (I first modified the try1 costfcn to have missing_atoms 1 as a component of the cost.)
There is a fixmode run at the end of the default make, but if a job terminates early it might not get run, so a manual "fixmode ." is useful in such cases. There is probably not a fixmode for specific makes of non-default targets.

Tue May 23 14:53:52 PDT 2006 Grant Thiltgen
try2-opt2 is slightly better than try2-opt1. Although for the whole protein, the logfile wasn't gzipped and there were only the try2-opt1 and try2-opt2 pdb files, and they also weren't gzipped, so I think the process crashed somewhere along the way, and I don't know if it will rerun from where I left off. Also, I found the server predictions, but there doesn't seem to be any scoring of them, so I don't know if that crashed somewhere as well.
try2-opt2 is looking at using only alignments from 2bcoA.
Only looking at P215-H312, try1-opt2 scores better than try2-opt2 for this region of the protein. I'm not sure if that means a chimera might be better. I suppose I should try running the first part of the protein again with just the template from 2bcoA to make sure that combining the two might not be better.
I'm going to try another run including 2bcoA and 1yw4A to see if maybe it can clean it up a bit. [This must be try3---KJK]
I also googled Succinylglutamate desuccinylase and it does appear to be a dimer, which may mean we need to run it as a dimer next, just to make sure we have things working right.

Tue May 23 16:42:37 PDT 2006 Grant Thiltgen
GAH! I'm not sure why, but both the full protein and the first segment are having difficulties using the alignments from 2bcoA and both 2bcoA and 1yw4A. There's an error in the makefile and all the output explodes to the terminal.

Tue May 23 21:29:05 PDT 2006 Kevin Karplus
I'm remaking decoys/score-all+servers.try1.pretty. I believe it did crash when I tried running it before, but I've made some fixes to undertaker, so it might be worth trying again.
I'm not sure what Grant means by "the full protein ... are having difficulties using the alignments ...".
Is there an error message in the log file, or is it just that undertaker prefers some other alignment?
The correct way to run an optimization run (in either this directory or a subdomain) is
    (make -k T0289.do2 >& do2.log; gzip -9f do2.log) &
(where the 2 is replaced by the number of the try*.costfcn).

Tue May 23 22:03:31 PDT 2006 Kevin Karplus
No, undertaker is still crashing when trying to read in the SCWRLed results from UNI-EID_expm_TS1 with messages:
    # Trying to read SCWRLed conformation from /var/tmp/from_scwrl_543885382.pdb
    undertaker: Segment.cc:95: int Segment::OK() const: Assertion `C_atoms[1] != C_atoms[2]' failed.
I'll have to try debugging that later. Tonight I have to pack for my trip to LA.

Wed May 24 14:10:43 PDT 2006 Grant Thiltgen
Okay. I started to re-run some of the undertaker runs that crashed. I think they might be working okay now, but only time will tell!

Thu May 25 13:50:24 PDT 2006 Grant Thiltgen
Everything looks like it ran okay. For the full protein try2-opt2 still does better than try3-opt2. try1-opt2 still does best for P215-H312. I'm gonna mess around with dimerizing try2-opt2 to see what happens.

Thu May 25 14:56:47 PDT 2006 Kevin Karplus
I need to fix the undertaker crash reading in SCWRLed UNI-EID_expm_TS1.
We may need to increase break weights and do a polishing run from multiple existing models.

Thu May 25 15:00:36 PDT 2006 Kevin Karplus
I changed superimpose-best.under to include the single-domain predictions in the superposition, to see if we want to create chimeras.

Thu May 25 16:02:33 PDT 2006 Grant Thiltgen
So I was messing around with creating a dimer, and unfortunately the templates used to make the protein don't create dimers in the same location as the dimer should be for this protein, so I'm not sure if creating a dimer is going to work well for it. Hmm.
try2-opt2 scores better for S1-F214, but it puts the end that I would need to create the chimera at the wrong end of the protein, so I'm not sure how well that is actually doing. try1-opt2 does it too. I may be just sticking with the entire protein.

Fri May 26 15:22:27 PDT 2006 Grant Thiltgen
So far try2-opt2 for the whole protein works best, so I'm going to try an optimization run to get rid of some of the breaks.

Fri May 26 18:47:40 PDT 2006 Kevin Karplus
The S1-F214 try1-opt2 matches the whole-chain try2-opt2 very well out to about F204.
The P215-H312 try1-opt2 does not match at all to the whole-chain try2-opt2. I wonder if it is more or less sensible than the whole-chain prediction.

Mon May 29 13:18:54 PDT 2006 Kevin Karplus
The problem that is crashing undertaker is with conformations that have two atoms in exactly the same place (an O and an N in the next residue at the same spot in the current crash). I think that the problem is in the input files, but I thought that I detected that problem and marked the offending atoms. I *was* only looking for identical adjacent atoms, though, and O and N aren't adjacent, so if the intervening C is different, ... .
Another possibility is that some transformation caused the missing-atom bit to be lost.

Tue May 30 12:57:53 PDT 2006 Grant Thiltgen
I tried upping some of the burial restraints in the first line to make the protein less foamy. It also looks like the new model works well, but there are still some chain breaks. Is that going to be okay for submission, or should I work on getting rid of them?
The main problem I have with the P215-H312 try1-opt2 is that P215 is slightly inaccessible to link up to the other half. Also the end of the protein that needs to be linked up is going off in the wrong direction. It might be a bit late, but would it be beneficial to try a run from residues 1-200, then 211 on?
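One rough way to list the remaining chain breaks in a model before deciding whether to polish further is to look at consecutive CA-CA distances. The sketch below is not part of undertaker and does not reproduce its break cost; the 4.2 Ang cutoff, the function names, and the format_atom demo helper are all assumptions made for illustration. It only reads standard fixed-column ATOM records.

```python
import math


def format_atom(serial, name, resname, resnum, x, y, z, chain="A"):
    """Hypothetical demo helper: build one fixed-column PDB ATOM record."""
    return (f"ATOM  {serial:5d} {name:<4s} {resname:>3s} {chain}{resnum:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


def ca_trace(pdb_text):
    """Return [(residue_number, (x, y, z)), ...] for CA atoms, in file order."""
    trace = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            trace.append((int(line[22:26]), xyz))
    return trace


def find_chain_breaks(pdb_text, max_ca_dist=4.2):
    """List (res_i, res_i+1, distance) where sequence neighbours are too far apart.

    max_ca_dist is an assumed cutoff: neighbouring CA atoms normally sit
    about 3.8 Ang apart, so anything well above that is reported as a break.
    """
    trace = ca_trace(pdb_text)
    breaks = []
    for (r1, p1), (r2, p2) in zip(trace, trace[1:]):
        d = math.dist(p1, p2)
        if r2 == r1 + 1 and d > max_ca_dist:
            breaks.append((r1, r2, round(d, 2)))
    return breaks


if __name__ == "__main__":
    demo = "\n".join([
        format_atom(1, "CA", "SER", 1, 0.0, 0.0, 0.0),
        format_atom(2, "CA", "ALA", 2, 3.8, 0.0, 0.0),
        format_atom(3, "CA", "GLY", 3, 10.0, 0.0, 0.0),  # too far from residue 2
    ])
    print(find_chain_breaks(demo))
```

Running it on the uncompressed text of a decoy such as try2-opt2 would give a quick count of breaks to compare between runs, though the numbers would not be expected to match undertaker's own break scoring.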
Tue May 30 14:33:39 PDT 2006 Grant Thiltgen
I tried running do5, but the current makefile is set up to run the buggy version of undertaker. Can the current makefile be temporarily changed in order to run the non-buggy version of undertaker?

Wed May 31 14:46:21 PDT 2006 Grant Thiltgen
I ran try5 with increased weights for wet6.5, dry5, dry6.5, and dry8. The scores slowly seem to get better. The second half of the molecule still seems really foamy, so I pulled the conserved residues for just that part of the molecule with the P215-H312 output. There weren't any in the hole in that part, but there were some within the groove, so it leads me to believe that the results are mostly good. There are some conserved residues sticking out into the solvent area, which may or may not need to be moved.
I also started runs for new domains for the first 200 residues and the rest of the protein, just to check to see if that makes a better chimera than trying to mix the other split we made.
I might try another run of the whole protein with the sidechain results increased. [That must be try6---KJK]

Thu Jun 1 07:37:56 PDT 2006 Kevin Karplus
It is past time to clean up the constraints in the costfcn---Grant is still running with the automatically generated constraints. Also, we need to add missing_atoms to the costfcn for scoring the server models.
try6-opt2 looks good out to about I219. I'm still a bit dubious about the C-terminal domain.

Thu Jun 1 08:58:11 PDT 2006 Kevin Karplus
Fixing undertaker to handle such messed up inputs as servers/UNI-EID_expm_TS1 looks difficult, so I have tried commenting it out of the read-pdb+servers.under file to try to get scoring for the other files.

Thu Jun 1 09:16:44 PDT 2006 Kevin Karplus
Foo! now undertaker is dying trying to score karypis.srv.4_TS1-scwrl
Is SCWRL returning duplicate points??? Or is undertaker messing up on reading back SCWRL results?
OK, the karypis.src.4_TS1 backbone is ugly, but it shouldn't be killing undertaker.
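The question of whether a server or SCWRL output file contains duplicate points can be checked outside undertaker. The following is a minimal sketch, assuming standard fixed-column ATOM records; the function name and the format_atom demo helper are hypothetical. It groups atoms whose printed x, y, z fields are character-for-character identical, regardless of whether the atoms are adjacent in the file.

```python
from collections import defaultdict


def format_atom(serial, name, resname, resnum, x, y, z, chain="A"):
    """Hypothetical demo helper: build one fixed-column PDB ATOM record."""
    return (f"ATOM  {serial:5d} {name:<4s} {resname:>3s} {chain}{resnum:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


def find_coincident_atoms(pdb_text):
    """Return groups of atom labels that share exactly the same printed coordinates."""
    by_xyz = defaultdict(list)
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        key = line[30:54]  # the x, y, z columns as written in the file
        label = (f"{line[17:20].strip()}{int(line[22:26])}"
                 f".{line[12:16].strip()}")
        by_xyz[key].append(label)
    # Keep only positions occupied by more than one atom.
    return [labels for labels in by_xyz.values() if len(labels) > 1]


if __name__ == "__main__":
    demo = "\n".join([
        format_atom(1, "CA", "GLY", 10, 1.0, 2.0, 3.0),
        format_atom(2, "O", "GLY", 10, 4.5, 5.5, 6.5),
        format_atom(3, "N", "ALA", 11, 4.5, 5.5, 6.5),  # same spot as GLY10.O
    ])
    for group in find_coincident_atoms(demo):
        print("coincident:", ", ".join(group))
```

Comparing the result for a raw server model with the result for its SCWRLed version would indicate whether the duplicates are already in the input or appear after SCWRL.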
Thu Jun 1 09:36:55 PDT 2006 Grant Thiltgen
I guess I'm not really sure which constraints to add or take out of the costfcn to change and improve the model. I agree with the C-terminal domain problem. Unfortunately, the groove where the active site is probably located overlaps with the good and bad parts of the protein.

Thu Jun 1 14:23:20 PDT 2006 Grant Thiltgen
I talked with George about some of the constraints. He gave me some good suggestions on how to look for which constraints to add for sheets. I'm running try7 without the sheet constraints or the rr constraints to see what may happen with the C-terminal region. George seems to think that the first 200 residues are matching up well enough that removing the constraints should be okay.
I'm also going to set up try8 with the original sheet constraints and some new constraints for some sheets that look like they might be in the C-terminal region based on the ehl2 composite information.

Fri Jun 2 10:49:00 PDT 2006 Grant Thiltgen
try7 seems a bit worthless, and try8 didn't turn out all that great, but I think I can tweak some of the constraints from try8 and work on the C-terminal region of the protein. I started try9 with some new strand and sheet constraints.

Make started Fri Jun 2 15:55:03 PDT 2006
Running on vashon.cse.ucsc.edu

Sat Jun 3 17:05:54 PDT 2006 Kevin Karplus
I made decoys/score-all+servers.unconstrained.pretty
The top scorer is try6-opt2. Other than SAM_T06_server_TS1, the next highest scorers are ROBETTA_TS[25341], then RAPTORESS_TS1-scwrl. The score change is pretty large down to raptoress.

Sun Jun 4 17:22:01 PDT 2006 Grant Thiltgen
I ran try4 for P215-H312, and I'm not too sure it's all that great, but I can try working with it for a bit. I realized that I'm still running with just the two top models, so I am going back to getting more fragments to model the end with the modified costfcn.
I also tried to make some new constraints for try11 with the n_notor and o_notor results.
Mon Jun 5 11:21:08 PDT 2006 Grant Thiltgen
That C-terminal domain is gonna drive me nuts. Undertaker really likes to do the same thing with the end even when I remove the constraints and add new ones in. Blah. try12 will attempt to manually define all the strands and helixes and see how undertaker wants to pack it.

Mon Jun 5 15:52:21 PDT 2006 Grant Thiltgen
try12 isn't really improving much. The last few runs I ran all kind of do the same thing and score similarly. I can't seem to get the residues to make a helix when I want them to. Blah.
I'm going to try try6 for just the C-terminal with the same constraints for the helices and see if that helps.

Tue Jun 6 09:40:31 PDT 2006 Kevin Karplus
I'll try making decoys/score-all+servers.try12.pretty to see if any of the servers are doing what Grant wants the protein to do.
The problem may be that he has put a large weight on the sheet constraints and a tiny weight on all other constraints, so that the other constraints are almost irrelevant. He also had typos in some of the added constraints (reported as errors when the constraints were read, but Grant must not have checked for error messages). Making decoys/score-all.try12.pretty with the output to an emacs buffer or a file is one fairly quick way to see the error messages for the try12 costfcn.
Looking at try12-opt2, the helix L85-F99 is upside down, with the buried face exposed. The conserved E88 should probably be in the active site with the other conserved charges, perhaps with E88.OE1 near N53.ND2 (though probably not close enough to hbond). The CB atoms of A92 and I95 should probably be near (say <7 Ang) the CB atoms of F50 and F16.
For try13, I'll try adding constraints to orient this helix properly and adjust the sheet constraints to be less powerful.
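Whether a given model already satisfies the proposed helix-orientation contacts can be checked directly from its coordinates. The sketch below is only an illustration and is not the undertaker constraint syntax: the pairing of residues (each of A92 and I95 against each of F50 and F16) is an assumed reading of the entry above, and the function names and format_atom demo helper are hypothetical.

```python
import math


def format_atom(serial, name, resname, resnum, x, y, z, chain="A"):
    """Hypothetical demo helper: build one fixed-column PDB ATOM record."""
    return (f"ATOM  {serial:5d} {name:<4s} {resname:>3s} {chain}{resnum:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


def cb_positions(pdb_text):
    """Map residue number -> (x, y, z) of its CB atom."""
    positions = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CB":
            positions[int(line[22:26])] = (
                float(line[30:38]), float(line[38:46]), float(line[46:54]))
    return positions


# Assumed pairing: every helix residue against every residue it should pack on.
DEFAULT_PAIRS = [(92, 50), (92, 16), (95, 50), (95, 16)]


def check_cb_contacts(pdb_text, pairs=DEFAULT_PAIRS, max_dist=7.0):
    """Return [(res_i, res_j, distance_or_None, satisfied), ...] for each pair."""
    cb = cb_positions(pdb_text)
    report = []
    for i, j in pairs:
        if i in cb and j in cb:
            d = math.dist(cb[i], cb[j])
            report.append((i, j, round(d, 2), d < max_dist))
        else:
            # A residue without a CB (missing atom, or glycine) cannot be checked.
            report.append((i, j, None, False))
    return report


if __name__ == "__main__":
    demo = "\n".join([
        format_atom(1, "CB", "PHE", 16, 0.0, 6.0, 0.0),
        format_atom(2, "CB", "PHE", 50, 5.0, 0.0, 0.0),
        format_atom(3, "CB", "ALA", 92, 0.0, 0.0, 0.0),
        format_atom(4, "CB", "ILE", 95, 0.0, 0.0, 10.0),
    ])
    for row in check_cb_contacts(demo):
        print(row)
```

A model with the helix flipped the wrong way would be expected to fail most of these pairs, which gives a quick sanity check before and after adding the constraints.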
Tue Jun 6 10:10:01 PDT 2006 Kevin Karplus
I threw out decoys/score-all+servers.try12.pretty, because try12.costfcn did not include missing_atoms, so would be highly misleading about servers that gave incomplete results. (Despite that, none of the servers except robetta scored well.)
I am making a decoys/score-all+servers.try13.pretty, which will probably favor try11-opt2 a lot, since the sheet constraints were taken from there.

Tue Jun 6 10:28:45 PDT 2006 Kevin Karplus
The best-scoring model with try13 is indeed try11-opt2, and SAM_T06_server_TS1 is the best-scoring server model, ROBETTA_TS5 next (way down the list) and RAPTORESS_TS2-scwrl after that.
Oops, I have to remake decoys/score-all+servers.try13.pretty, since I did not notice Grant's typos in the constraints he added. I've fixed them in try13, and will rescore the server models.
Looking at the superposition in best-models.pdb.gz, based on
    ReadConformPDB T0289.try11-opt2.pdb
    ReadConformPDB T0289.try7-opt2.repack-nonPC.pdb
    InfilePrefix P215-H312/decoys/
    ReadConformPDB T0289.try1-opt2.pdb
    InfilePrefix A201-H312/decoys/
    ReadConformPDB T0289.try1-opt2.pdb
    InfilePrefix S1-F214/decoys/
    ReadConformPDB T0289.try2-opt2.pdb
    InfilePrefix decoys/servers/
    ReadConformPDB SAM_T06_server_TS1.pdb
    ReadConformPDB ROBETTA_TS5.pdb
    ReadConformPDB RAPTORESS_TS2.pdb
    InFilePrefix
    ReadConformPDB T0289.undertaker-align.pdb model 1
    ReadConformPDB T0289.undertaker-align.pdb model 2
    ReadConformPDB T0289.undertaker-align.pdb model 3
    ReadConformPDB T0289.undertaker-align.pdb model 4
    ReadConformPDB T0289.undertaker-align.pdb model 5
I see that there is general agreement about the sheets *except* for Y287-T297. Perhaps I should remove the sheet constraints for that region, and just use Strand constraints, to avoid biasing the model selection too much.

Tue Jun 6 11:13:43 PDT 2006 Kevin Karplus
OK, I modified try13.costfcn, rescored everything with it, and am now running try13 (from the original alignments, like try1) on cheep.
Tue Jun 6 12:35:59 PDT 2006 Grant Thiltgen
I ran try6 and try7 for just the C-terminal region. I think try7 might be something I can work with for a chimera. We'll see. Also, I ran try2 with the A201-H312 region to see if a little bit of overlap might help, and it appears that I might be able to try to make another chimera from that.

Tue Jun 6 14:31:43 PDT 2006 Kevin Karplus
try13-opt2 scores almost as well as try11-opt2. The constraints are not quite so well met (not surprising, since the constraints were taken from try11-opt2), but the other scores are better. The difference in total cost is less than the differences in constraints.
Rosetta still doesn't like repacking this model. It prefers try7-opt2, which I think looks terrible.
Somewhat surprisingly, try13-opt2.repack-nonPC scores better than try13-opt2.
Grant does not seem to have put his chimeras into the decoys directory (where they belong), so they are not getting scored in the score-all scripts.

Tue Jun 6 14:41:10 PDT 2006 Grant Thiltgen
Gah! I can't seem to get the two models I want to superimpose to superimpose at residue 215 in order to create a model that works well. I'd like to be able to get them to superimpose there, and maybe use protein shop to remove some clashes before polishing it up with undertaker, but when I use the command
    PrintAllConformPDB make-chimera.pdb atom P215 superpose
in the undertaker script that I used ("make-chimera.under"), the residues are still far apart. I also can't move them closer together in ProteinShop (I tried).

Tue Jun 6 15:07:16 PDT 2006 Grant Thiltgen
Well, I got it to overlap at around residue 218 and 219 instead of 216, so I'll try to chop the protein there and see what I can do with it.

Tue Jun 6 15:38:01 PDT 2006 Grant Thiltgen
I'm going to run try14, which is attempting to work with the first chimera I made.
I'm hoping to go through an optimization run to see how well undertaker can work with the chimera I made:
    T0289.try11-opt2-chimera-try7-opt2-C-terminal.pdb

Tue Jun 6 15:55:22 PDT 2006 Kevin Karplus
The reason that using undertaker to create the chimera was not working well is two-fold:
1) the atoms specified just give an initial superposition of the two conformations, which are then reoptimized to overlap as well as possible. So the overlapping regions of the two predictions can override the initial superposition. This can be reduced by truncating the predictions so that only the residues used for aligning the domains are left. (I did half this job in try11-opt2-chopped-to-C217.pdb)
2) the connection that Grant was specifying (around P216) would result in the two domains colliding badly. So even if the splicing was done right, the resulting chimera would be a mess.
What happens if we try superposing D220, V221, Y222, K223?

Tue Jun 6 16:16:12 PDT 2006 Kevin Karplus
D220, V221, Y222, K223 makes a pretty crummy crossover, but Q247, D248, Q249 looks like it might work.

Tue Jun 6 16:21:53 PDT 2006 Kevin Karplus
Nope, bad idea. Probably the best thing to do at this point is for Grant to work with Firas on using ProteinShop to place the second domain relative to the first where he wants it---the structures are too different to make an easy crossover just by lining up a fragment.
At this point, probably the most valuable thing to do is to polish up the try11/try13 line to close gaps and remove clashes.

Tue Jun 6 16:30:36 PDT 2006 Kevin Karplus
Grant's T0289.try11-opt2-chimera-try7-opt2-C-terminal.pdb is intriguing, and worth pursuing, but I don't think that try14 will clean it up. I think that some sheet constraints are needed to hold the two domains together as undertaker tries to close the gaps. Still, we can judge that when try14 is half-finished and has produced a try14-opt1 model.
Nope, Grant read in ALL the models, not just his chimera, so he'll end up polishing something else--probably try6.

Tue Jun 6 16:38:05 PDT 2006 Grant Thiltgen
Whoops! Sorry about that! I still don't realize what needs to be commented out sometimes. I can start try15 with some sheet constraints to hold in the gaps, and make it so it only works with the one pdb file.

Tue Jun 6 17:52:21 PDT 2006 Kevin Karplus
try15.costfcn doesn't actually have any constraints:
    # SheetConstraint Error: residue specified as P151 doesn't match (T0289)C151
    # Error: can't parse residue name in position0
I have found it useful to do a
    make decoys/score-all.try15.pretty
after creating a new costfcn, to make sure I don't have any typos like this in the constraints.

Tue Jun 6 21:39:03 PDT 2006 Grant Thiltgen
Ah! I see what happened. When I saved the PDB file from protein shop it renumbered them starting with 2. All the numbers are off by one. I went ahead and fixed that and I'm running a new run try16 with the constraint to hold the two strands together. I'm also including some of the other old constraints to keep it from getting too crazy as well.

Wed Jun 7 11:47:38 PDT 2006 Grant Thiltgen
try16 didn't seem a whole lot better than try15. I'm starting try17 to see if I can get that sheet between the two domains to form.

Wed Jun 7 17:18:32 PDT 2006 Grant Thiltgen
try17 is finished. It really doesn't seem much different than try16. I'm not sure that the chimera is the way to go. It may be better to try to use some of the better models from fold-recognition (try11, try13, try6) than pursue this. It seems even foamier than before, and I'm not really sure how to get that sheet to line up better. I'll give it another go this afternoon though before I make sure it's not going to work. I'm also going to try to pull in the helices that seem to stick out.
I'd also like to clean up some of the breaks in the chain in some of the models, but I'm not sure how to get undertaker to do that.
I've tried increasing the weight of breaks in the costfcn, but I'm not sure if that actually works.

Wed Jun 7 23:54:54 PDT 2006 Grant Thiltgen
try18 finished, and the sheet still isn't right. I don't know what else I can do to fix that. The runs based on the chimera (try18, try17, try16, and try15) seem to score better using the unconstrained cost function, but they don't seem to be that much improved over try6 (which is a refined model of the early stuff direct from undertaker). Maybe the chimera is slightly better, but I'm not incredibly sure if it's better or how to improve it with the sheets and the dangling helices at the end. I'm also running try19 to try and refine try11 and try20 to refine try13.

Thu Jun 8 09:35:04 PDT 2006 Grant Thiltgen
I'm starting one more run on the chimera to try to get that sheet where it should be: try21. I'm taking out all constraints except the one for the sheet, which I'm gonna turn up the weight on a bit.
After polishing the other two models with try19 and try20, the unconstrained costfcn still has try18 at the top.

Thu Jun 8 14:02:45 PDT 2006 Grant Thiltgen
try21 appears to get the sheet made, but it tore apart part of the original sheet. I started try22 which used the template from try18 with the original sheet constraints. I'm also planning on running try23 with the new template when try21 finishes to try to repair the sheet.
The chimeras are still scoring well with the unconstrained costfcn, but I'm still not completely sure that it is better than the targets before the chimera.

Thu Jun 8 16:52:21 PDT 2006 Kevin Karplus
try21-opt2 looks a tiny bit better than try18-opt2, but actually scores worse with unconstrained.costfcn.
We'll submit
    ReadConformPDB T0289.try6-opt2.pdb
    ReadConformPDB ROBETTA_TS5.pdb
    ReadConformPDB T0289.try21-opt2.pdb
    ReadConformPDB T0289.undertaker-align.pdb model 1
    ReadConformPDB T0289.undertaker-align.pdb model 4
I'm running a polishing run (try23 on cheep) to fix up the ROBETTA models.
Grant will do a polishing run on try6-opt2 (try24). The polished versions will replace the others when done.

Fri Jun 9 07:40:21 PDT 2006 Kevin Karplus
I replaced the ROBETTA model last night and will replace try6 with try24 this morning, resulting in
    ReadConformPDB T0289.try24-opt2.pdb
    ReadConformPDB T0289.try23-opt2.pdb
    ReadConformPDB T0289.try21-opt2.pdb
    ReadConformPDB T0289.undertaker-align.pdb model 1
    ReadConformPDB T0289.undertaker-align.pdb model 4

Fri Jun 9 07:45:43 PDT 2006 Kevin Karplus
email submission done.

Date: Thu, 8 Jun 2006 20:52:25 -0700 (PDT)
From: Grant Thiltgen
To: Kevin Karplus
Subject: undertaker runs finished

try22 (which is another version of try18) and try24 (the refinement of try6) are finished. I remade score-all.unconstrained.pretty to check out the scores. It appears that try22 is slightly better than try18, but not much, so try18 is probably okay to go with. Also, try24 is slightly better than try6. I'm not sure it's much of an improvement, but we can submit that one instead.
G.
------------------------------------------------------------

Fri Jun 9 10:06:49 PDT 2006 Kevin Karplus
OK, try21-opt2 will be replaced by try22-opt2, since both undertaker and rosetta like it better. DONE.
Submissions are now:
    ReadConformPDB T0289.try24-opt2.pdb
    ReadConformPDB T0289.try23-opt2.pdb
    ReadConformPDB T0289.try22-opt2.pdb
    ReadConformPDB T0289.undertaker-align.pdb model 1
    ReadConformPDB T0289.undertaker-align.pdb model 4

Tue Jul 11 11:42:44 PDT 2006 Kevin Karplus
The REAL_PDB file is 2gu2A. I'm running an evaluation of the servers and our models.
It looks like our server beat our manual predictions, and that try23-opt2 made ROBETTA_TS5 worse, not better. try22-opt2 is slightly better than try21-opt2, but try22-opt2.gromacs0 is better still. try24-opt2.gromacs0 would have improved on try24-opt2, but still not gotten it to the level of SAM_T06_server.
The best server model appears to be RAPTORESS_TS1, with our server model about 7th among the TS1 models:
    RAPTORESS_TS1
    Zhang-Server_TS1
    FAMSD_TS1
    SP4_TS1
    SPARKS2_TS1
    ROBETTA_TS1
    SAM_T06_server_TS1
None of our hand tries were as good as our server model. Our best was probably try8-opt2.gromacs0, with an all-atom RMSD of 8.2 and GDT of 37%. The best we submitted was model 2 (8.8 all-atom RMSD and GDT of 38.1%, based on ROBETTA_TS5, which did better). RAPTORESS_TS1 had 7.0 RMSD and GDT of 42.8%.
I think I need to put some weight on GDT and smooth_GDT and reduce the weight on missing_atoms.
Looking just at GDT, the best model is RAPTOR_TS2 (44.4%) and SAM_T06_server is the 27th TS1 server. This may be a more valid indication of how it is doing on the modeling than the RMSD-heavy evaluation I've been using.

Fri Jul 14 11:00:48 PDT 2006 Kevin Karplus
The decoys/evaluate.unconstrained.pretty file shows both the undertaker costs (unconstrained) and real costs, which combine clens, log_rmsd, log_rmsd_ca, GDT, smooth_GDT, and missing atoms. The missing_atoms weight in the real cost is misreported in the header, since it appears in both cost functions with different weights.
RAPTOR_TS2 does the best, but of the TS1 server models, RAPTORESS_TS1 comes out on top. SAM_T06 is 18th of the servers for TS1 models---adequate, but not great.
GDT values are only around 40%, with RAPTOR_TS2 getting 44%, so no one nailed this one.
Our best model is try23-opt2.gromacs0, which we did not submit, but our model2 (which was try23-opt2) is almost as good.
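The difference between an RMSD-heavy ranking and a GDT ranking can be seen with a toy calculation. The sketch below is a simplification and not the evaluation code used for these runs: a real GDT searches over many superpositions, whereas here the per-residue CA deviations are simply taken as given from one fixed superposition. The function names and the example distances are made up for illustration.

```python
import math


def rmsd(distances):
    """Root-mean-square of per-residue deviations (Ang) for one superposition."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))


def gdt_fixed_superposition(distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Percent score: average fraction of residues within each cutoff.

    Simplified GDT_TS-style score; no search over superpositions is done.
    """
    n = len(distances)
    fractions = [sum(1 for d in distances if d <= c) / n for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


if __name__ == "__main__":
    # Two hypothetical models: identical except for one badly misplaced residue.
    compact = [0.5, 1.5, 3.0, 6.0, 7.5]
    with_outlier = [0.5, 1.5, 3.0, 6.0, 20.0]
    for name, dists in (("compact", compact), ("with_outlier", with_outlier)):
        print(f"{name}: RMSD {rmsd(dists):.2f}  GDT {gdt_fixed_superposition(dists):.1f}")
```

In this toy case the single outlier roughly doubles the RMSD while lowering the GDT-style score by only a few points, which is one reason the two measures can order the same set of models differently.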