Wed Jun 7 09:28:22 PDT 2006 T0320 Make started
Wed Jun 7 09:30:22 PDT 2006 Running on camano.cse.ucsc.edu

Wed Jun 7 16:56:41 PDT 2006 Kevin Karplus
We seem to have a good comparative modeling task for the N-terminal part, with an excellent hit to 1sur, but it only extends up to about N228. We'll probably need to do subdomain modeling for N210-N306 (started on farm cluster).

Note: blast puts 1sur first, but only with an E-value of 1.9, while the HMMs select it with E-value 2.7e-23.

The t06 alignment has a strong conservation signal that does not quite match the sequence, so there may be contamination. T04 shows the same sequence logo, but t2k is quite different, having much more conservation near the C-terminus, and a better match of the sequence to the conserved residues. T2k, however, still likes 1sur best with a strong E-value (6.2e-26), so this will probably not affect the fold choice much.

T06 moves 1gpmA to the top, with 1sur second, but 1gpmA has a domain with the same superfamily (c.26.2.*) as 1sur (Adenine nucleotide alpha hydrolases-like), and is within the same "group of families"---the first three families of the superfamily.

Wed Jun 7 21:16:10 PDT 2006 Kevin Karplus
try1-opt2 and the models from the alignments are pretty much in agreement for E32-C214, but the first 30 and last 110 residues are a bit iffy.

The N210-N306 subdomain has not run through undertaker yet, but the HMMs found no good hits for it (1twyA at E-value 6.7 is the best). Blast liked 1ak0 best, with E-value 0.239, but none of the HMM searches liked it at all.

Wed Jun 21 16:11:00 PDT 2006 Firas Khatib
Robetta used 1zunA as the parent, but the HMMs say it only has an E-value of 1.9e-03, compared to 1sur's 2.7e-23.

Robetta also has 10 models for the subdomain 237-306. These are not in decoys/servers, but we can get them from the website if we want to compare them.

Thu Jun 22 15:45:18 PDT 2006 Firas Khatib
Well, this subdomain is very tough.
The first 3 Robetta models all align residues 274-286 in the same way, but other than that I don't see any similarity.

The secondary structure predictions seem to vary between T06/T04 and T2k. For example, for T06 & T04, around residues 248-252 the o_notor and n_notor predict antiparallel strand, as does dssp-ehl2 (very strongly), but t2k predicts helix around there (not very strongly predicted, of course), and Rosetta has all helix everywhere (although that is no surprise).

This protein is a Flavin adenine dinucleotide (FAD) synthetase according to the CASP website; maybe we should be looking for FADs that might be similar?

Using Blastp on the website and "related structures", there are a few alignments that might work for certain segments. For example, residues 216-232 align to 420-436 in 1ctn with a sequence identity of 41%. Looking just at those residues, in the Undertaker model it's a sheet, whereas in the Robetta models it is indeed helix, so I think I will put in helix constraints for 216-232 (except N229).

There is 48% sequence identity with 2cyx (residues 97-123) for 251-277, which seems very helical, but try1 has it as strands as well. Robetta has it (residues 19-46 in the Robetta models' numbering) as helical or coil (except model 10), so I was thinking of making helix constraints for K257-R263 and T269-K277.

Fri Jun 23 17:29:31 PDT 2006 Firas Khatib
Looking at the other related structures (2bjh, 1n08, 1c0i), there seems to be some agreement for helix constraints from D258-N286 (with much of the agreement overlapping, especially S265-K277, so I will try strong helix constraints there). Some predict strands for L245-F255, so I will try that as well.

For try2 I have added these:
HelixConstraint D258 N286 0.7
HelixConstraint S265 K277 0.9
HelixConstraint L216 P232 0.6
StrandConstraint L245 F255 0.6
and kept this dssp-ehl2 constraint:
HelixConstraint T297 R300 0.6498
This is try2 running on lopez.
try3 will have the same, but with an added strand constraint that I am not certain of:
StrandConstraint Y288 V294 0.9
I will give it a strong weight to see what happens! This is try3 running on lopez.

try4 will have a helix there instead:
HelixConstraint N286 R300 0.9
try4 is running on shaw.

Sat Jun 24 16:11:43 PDT 2006 Firas Khatib
I just realized that I should be discussing this subdomain in the other README:
casp7/T0320/N210-N306/README
so I will continue over there!

Mon Jun 26 14:05:26 PDT 2006 Firas Khatib
We are running try2 in the main directory, using the constraints from try8 of the subdomain, with the T0320.undertaker-align.sheets as well. try2 is running on lopez.

Mon Jun 26 18:29:29 PDT 2006 Firas Khatib
try3 is the same, using try1-opt2.sheets & try1-opt2.helices instead. Running on camano.

Tue Jun 27 10:56:15 PDT 2006 Kevin Karplus
Please put all discussion of the model (including of subdomain optimization) in the main README file---I may not see the stuff in the subdomain READMEs and thus miss important information. It is hard enough to keep track of 45 targets at once, without also keeping track of all their possible subdomains.

try3-opt2.repack-nonPC is the best-scoring model with the try3 costfcn, not entirely due to constraint scores.

Is there going to be a chimera made from the subdomains and optimized also?

Tue Jun 27 11:01:54 PDT 2006 Kevin Karplus
Here is the N210-N306/README file:

Wed Jun 7 16:58:33 2006 split-into-domains created subdirectory for N210-N306 of T0320
Make started Wed Jun 7 17:00:35 PDT 2006
Running on farm09.cse.ucsc.edu

Sat Jun 24 16:12:44 PDT 2006 Firas Khatib
try2 scores best, followed by try3 and try4 (but none of them score THAT much better than try1).

Looking at try2, it was not able to use all the constraints. For example, I put in:
HelixConstraint L216 P232 0.6
and this became a strand.
(presumably so that it could pair up with: StrandConstraint L245 F255 0.6)

I like the helix in try3 (from D258-N286), so I will increase that constraint from 0.7 to 0.9.

I have taken try2 and increased the constraints to 50 (from 20) and the weights to:
HelixConstraint D258 N286 0.9
HelixConstraint S265 K277 0.9
HelixConstraint L216 P232 0.7
StrandConstraint L245 F255 0.6
HelixConstraint T297 R300 0.6498
try5 is running on lopez.

In try3 I like the way that Y288-Y292 could form a strand with L245-F255. I will try a run similar to try2, but with those 2 strand constraints higher, and see if they connect; otherwise I will add h-bond constraints to bring those 2 potential strands together. This will be try6 on lopez, with the constraints increased from 20 to 50:
HelixConstraint D258 N286 0.9
HelixConstraint S265 K277 1.0
HelixConstraint L216 P232 0.6
StrandConstraint L245 F255 0.9
HelixConstraint T297 R300 0.6498
StrandConstraint Y288 V294 0.9

I don't like try4 except for the final helix from N286-R300; in future I think I will increase that constraint.

Mon Jun 26 11:49:36 PDT 2006 Pinal Kanabar
Starting with try5, we did not like the breaks in the chain, so we increased BREAKS from 50 to 80. We also increased CONSTRAINTS from 50 to 80, got rid of an overlapping constraint (265-277), and added a strand constraint:
StrandConstraint P234 F255
because it seemed like 234-244 could create a sheet with 245-255... we'll see if that works. We also added:
StrandConstraint Y288 V294
because we liked it in try3 and it seemed to work well in try6 (but the rest of try6 did not score well), so we will see how this does in try7 (running on camano).

For try8, we wanted to keep the try6 model and make just 1 change:
HelixConstraint D258 K277 1.0
We kept everything the same except that we removed the overlapping constraint. try8 is running on lopez.
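
The overlapping constraint removed above (S265-K277 sitting inside D258-N286) can be spotted mechanically. What follows is only a minimal sketch, assuming the constraint lines have exactly the form quoted in this log (keyword, two residue labels, weight). The names Constraint, parse_constraints, and overlapping_pairs are made up for illustration and are not part of undertaker.

```python
import re
from typing import NamedTuple


class Constraint(NamedTuple):
    kind: str      # "Helix" or "Strand"
    start: int     # first residue number
    end: int       # last residue number
    weight: float


# Line format assumed from the examples in this log, e.g.
# "HelixConstraint D258 N286 0.9". Lines without a weight are skipped.
_LINE = re.compile(
    r"^(Helix|Strand)Constraint\s+[A-Z](\d+)\s+[A-Z](\d+)\s+([\d.]+)$"
)


def parse_constraints(text):
    """Return a Constraint for every line that matches the assumed format."""
    found = []
    for line in text.splitlines():
        m = _LINE.match(line.strip())
        if m:
            found.append(
                Constraint(m.group(1), int(m.group(2)),
                           int(m.group(3)), float(m.group(4)))
            )
    return found


def overlapping_pairs(constraints):
    """Return pairs of same-kind constraints whose residue ranges intersect."""
    pairs = []
    for i, a in enumerate(constraints):
        for b in constraints[i + 1:]:
            if a.kind == b.kind and a.start <= b.end and b.start <= a.end:
                pairs.append((a, b))
    return pairs


# The subdomain try5 constraint set, copied from the entry above.
TRY5 = """\
HelixConstraint D258 N286 0.9
HelixConstraint S265 K277 0.9
HelixConstraint L216 P232 0.7
StrandConstraint L245 F255 0.6
HelixConstraint T297 R300 0.6498
"""
```

Calling overlapping_pairs(parse_constraints(TRY5)) should report only the D258-N286 / S265-K277 pair. Whether an overlap is actually harmful depends on how the cost function combines the weights, which this sketch does not model.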
Mon Jun 26 12:18:01 PDT 2006 Firas Khatib
try7 was awful, so we have increased the try7 constraints from:
StrandConstraint P234 F255 6
StrandConstraint Y288 V294 6
to:
StrandConstraint P234 F255 10
StrandConstraint Y288 V294 8
Hopefully this will help. try9 is running on orcas.

Tue Jun 27 15:58:24 PDT 2006 Firas Khatib
I took residues 1-210 from try3-opt2.repack-nonPC and 211-306 from try8-opt2 in the subdomain directory. I will run it through undertaker to get rid of the few clashes that exist (according to Proteinshop). We are running this as try4 on shaw.

Tue Jun 27 18:26:13 PDT 2006 Kevin Karplus
try4-opt1 scores well with the try4 costfcn, which is promising. Unfortunately, it also has huge holes in the C-terminal domain.

Tue Jun 27 18:34:32 PDT 2006 Firas Khatib
I am very surprised by the gaps, since we have breaks set at 100.

Tue Jun 27 18:49:16 PDT 2006 Firas Khatib
I am going to launch 2 more chimeras, both using residues 1-210 from try3-opt2.repack-nonPC, with 211-306 from try5-opt2 & try2-opt2 in N210-N306, so that we have a variety of different models for domain 2.

This is the hierarchy for the N210-N306 tries:
try2 -> try5 -> try7
try3 -> try6 -> try8
try4
If there is a free CPU I will include a chimera of try4 (but I don't think it's right).

Tue Jun 27 19:20:56 PDT 2006 Firas Khatib
try5 is running on lopez and uses chimera-try3andtry5combined.pdb, which combines try3-opt2.repack-nonPC (residues 1-211) and try5-opt2 (residues 212-306).

Tue Jun 27 19:28:56 PDT 2006 Firas Khatib
try6 is running on orcas and uses chimera-try3andtry2combined.pdb, which combines try3-opt2.repack-nonPC (residues 1-211) and try2-opt2 (residues 212-306).

Tue Jun 27 19:33:05 PDT 2006 Firas Khatib
try7 will use try4 from the 2nd subdomain (just because it is very different). It will be try3-opt2.repack-nonPC (residues 1-219) and try4-opt2 (residues 220-306). It is running on orcas using chimera-try3andtry4combined.pdb.

Hopefully these will all be diverse enough to submit.
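
For reference, the residue-range splice behind these chimeras can be sketched in a few lines. This is not how the chimera files above were actually built (the log does not say). It is a minimal illustration that assumes both models are standard PDB text using the full-length target numbering and already sitting in a compatible coordinate frame. The function names are hypothetical.

```python
def residue_number(line):
    """Residue sequence number from a PDB ATOM record (columns 23-26)."""
    return int(line[22:26])


def splice_models(nterm_lines, cterm_lines, last_nterm_residue):
    """Build a chimera from two lists of PDB lines.

    Keeps ATOM records with residue number <= last_nterm_residue from the
    first model and ATOM records with a higher residue number from the
    second. Assumption: both models share one numbering scheme and one
    coordinate frame; no superposition or clash removal is done here.
    """
    chimera = []
    for line in nterm_lines:
        if line.startswith("ATOM") and residue_number(line) <= last_nterm_residue:
            chimera.append(line)
    for line in cterm_lines:
        if line.startswith("ATOM") and residue_number(line) > last_nterm_residue:
            chimera.append(line)
    return chimera
```

With last_nterm_residue=210 this corresponds to the 1-210 / 211-306 split described for try4. The junction will generally still need optimization afterwards, as was done here by running the chimera through undertaker.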
Tue Jun 27 20:14:10 PDT 2006 Kevin Karplus
The problem wasn't chain breaks, but poor packing, so that there were large water-filled cavities. In fact, setting breaks so high may be discouraging it from trying reasonable solutions---once it closes a gap, it becomes quite rigid.

Tue Jun 27 22:40:20 PDT 2006 Kevin Karplus
None of the models look particularly good. I need to pick some to submit tonight, and I can replace them in the morning if Firas picks other ones. Based on scoring with unconstrained.costfcn and try8.costfcn, I'll pick
try4-opt2
try3-opt2.repack-nonPC
try1-opt2
try6-opt1
try7-opt1

Tue Jun 27 23:00:42 PDT 2006 Kevin Karplus
So submitted. We'll probably want to at least switch the last two to opt2 versions.

Wed Jun 28 00:05:35 PDT 2006 Firas Khatib
If setting breaks so high might have been the problem, I will rerun try4, try5, try6, and try7 with breaks set at 50 instead of 100.
try4 will become try9 and is running on lopez
try5 will become try10 and is also running on lopez
try6 will become try11 and is running on shaw
try7 will become try12 and is running on shaw

Wed Jun 28 08:13:52 PDT 2006 Kevin Karplus
try8 did not get run last night (I fell asleep before try6 and try7 finished), so I will run it now, to polish up the existing models (including try9 through try12). try8 is running on cheep.

Currently, rosetta hates all the backbones for repacking sidechains, but the least bad are try5-opt2, try11-opt2, and try10-opt2.
The try8 costfcn prefers
try9 < try4
try11 < try6
try3-opt2.repack-nonPC
try1
try7 < try12
try2
try10 < try5

try8 will probably just polish up try9-opt2, but may use crossover to get some good feature from a different model (unlikely in this case).

I will change the current submission to
ReadConformPDB T0320.try9-opt2.pdb
ReadConformPDB T0320.try11-opt2.pdb
ReadConformPDB T0320.try3-opt2.repack-nonPC.pdb
ReadConformPDB T0320.try1-opt2.pdb
ReadConformPDB T0320.try7-opt2.pdb

Wed Jun 28 09:20:01 PDT 2006 Firas Khatib
Hopefully try8 will finish in time, and score the best!

Wed Jun 28 10:55:47 PDT 2006 Kevin Karplus
Submitting try8-opt1 as the new first model.

Wed Jun 28 13:13:21 PDT 2006 Kevin Karplus
try8 finished too late to submit try8-opt2.