Wed May 17 09:19:58 PDT 2006 T0291
Make started Wed May 17 16:20:47 PDT 2006
Running on shaw

Wed May 17 19:07:09 PDT 2006 Kevin Karplus
    The t06 alignment has 226 pdb templates in it!

Wed May 17 19:35:40 PDT 2006 Kevin Karplus
    RPSblast identifies the protein as tyrosine kinase:
        cd00192, TyrKc, Tyrosine kinase, catalytic domain.
        Phosphotransferases; tyrosine-specific kinase subfamily.
        Enzymes with TyrKc domains belong to an extensive family of
        proteins which share a conserved catalytic core common to both
        serine/threonine and tyrosine protein kinases.  Enzymatic
        activity of tyrosine protein kinases is controlled by
        phosphorylation of specific tyrosine residues in the activation
        segment of the catalytic domain or a C-terminal tyrosine (tail)
        residue with reversible conformational changes.

    The best BLAST hits in pdb are
        1jpaB 1mqbB 1qcfA 2ptk 1y57A

    1jpa[AB] is 79% identical and 90% positive with two 1-residue gaps
    over 234 residues.
    1mqb[AB] is 72% identical and 84% positive with one 1-residue gap
    over 214 residues.
    These two are much better than the third best (1qcfA: 42% identical
    and 62% positive with 5 gaps (up to length 4) over 119 residues).

    1jpaA is in our template library, but 1mqbA is not.
    Perhaps we need to add it as MANUAL_TOP_HITS if it doesn't come out
    in the top few hits by the automatic process.

Wed May 17 21:01:40 PDT 2006 Kevin Karplus
    The t04 alignment has 54 pdb templates in it (fewer than t06, but
    still a lot).

Make started Thu May 18 16:01:02 PDT 2006
Running on orcas.cse.ucsc.edu
    I killed the jobs on shaw and restarted them on orcas, because shaw
    seemed to be thrashing in muscle.  (I also removed the muscle
    alignments from the pairwise alignment targets.)
Thu May 18 18:50:51 PDT 2006 Kevin Karplus
    The top blast hits are very strong (79%id):

    query  target  %id    len  mism  gaps  qstart  qend  sstart  send  e-value  bits
    T0291  1jpaA   79.05  296    60     2       2   295      10   305  1e-139   490.7
    T0291  1mqbA   72.05  297    82     1      10   305      29   325  6e-126   445.3
    T0291  1qcfA   42.96  277   146     6      11   285     174   440  1.6e-60  228.0
    T0291  2ptk    44.84  281   143     6       7   285     167   437  2.4e-59  224.2
    T0291  1y57A   44.13  281   145     6       7   285     166   436  2.0e-58  221.1
    T0291  1fmk    44.13  281   145     6       7   285     166   436  2.0e-58  221.1
    T0291  1yolA   44.57  276   141     6      12   285       2   267  4.4e-58  219.9
    T0291  1mp8A   39.93  268   158     2      21   288      11   275  7.6e-58  219.2
    T0291  1yojA   44.00  275   144     5      12   285       2   267  9.9e-58  218.8

    This ordering is somewhat different from the ordering by the HMMs,
    though 1jpaA comes out high on that list also (4th).

Thu May 18 19:24:51 PDT 2006 Kevin Karplus
    There seems to be something wrong with undertaker---the
    T0291.undertaker-align.pdb.gz file has only one model, though there
    are several in the undertaker-align.under script.  This is the only
    target that has shown this behavior, though it is reproducible on
    both cheep and lopez, even after recompiling part of undertaker.

Thu May 18 19:36:27 PDT 2006 Kevin Karplus
    I found and fixed the bug in undertaker, and killed the try1 that
    was running.  While I'm at it, I might as well add the top BLAST
    hits as MANUAL_TOP_HITS.

Make started Thu May 18 19:43:45 PDT 2006
Running on lopez.cse.ucsc.edu
    Rerunning the make (which should be mostly no-ops until the
    undertaker section).

Fri May 19 08:53:18 PDT 2006 Kevin Karplus
    This looks like a 2-domain protein, with a domain break around N110.
    A lot of the conserved residues are in the cleft between domains, so
    accurate modeling may be difficult without the ligand, which was not
    provided (and which we couldn't do anything with anyway).

    Some of the secondary-structure prediction looks a bit off, though
    the averaged dssp-ehl2 predictions seem better than the str2
    predictions.

Thu May 25 15:04:06 PDT 2006 Kevin Karplus
    Have only done fully automatic prediction.
    Need to do another run with break and soft_clashes turned up.
    Probably should just use the MANUAL_TOP_HITS, maybe adding in the
    templates used in try1:
        2b7aA, 1xbbA, 1jpaA, 1qpcA

    We should score the server models.
    After that we should probably do a polishing run from all existing
    models.

Sat Jun 3 08:43:57 PDT 2006 Kevin Karplus
    Starting an optimization run (try2 on lopez) using top alignments
    and a costfcn with fewer constraints (but with clashes and breaks
    turned up).  Also scoring the server models with the try2 costfcn
    (if undertaker doesn't crash on them).

Sat Jun 3 09:00:10 PDT 2006 Kevin Karplus
    Partly, but not wholly, because the try2.costfcn uses the sheets
    from try1-opt2 as constraints, try1-opt2 scores best with that cost
    fcn.  The second best is SAM_T06_server_TS1, which scores 11 points
    worse (3.3 of which is from the constraints).  ROBETTA_TS3 is next
    with fewer clashes and breaks, but a cost of 15 more on constraints.
    ROBETTA_TS4 scores very well on breaks and clashes, but the
    constraints add 28 to the cost.

    There are some servers that do ok on the constraints but score
    terribly (like MetaTasser, which seems to have grossly overcompacted
    the structure).

    A lot of servers are scoring better after SCWRLing---though
    sometimes with soft_clashes going up, so it may just be a matter of
    the relative weight of sidechain energy and clash avoidance.
    (Robetta scores get worse after scwrling.)

Sat Jun 3 09:28:38 PDT 2006 Kevin Karplus
    With the unconstrained costfcn, try1-opt2 still scores best and
    SAM_T06_server_TS1 second, but third is now ROBETTA_TS4, followed by
    ROBETTA_TS3.  After various Robetta models, the scwrled RAPTOR_ACE
    models come next.

Sat Jun 3 14:08:43 PDT 2006 Kevin Karplus
    The try2 run scores slightly worse than try1 on the try2 cost
    function, almost entirely due to constraints, though it also scores
    worse on the unconstrained costfcn.  The main differences are in the
    N- and C-termini, which we are unlikely to get right.

    I think it is time for a polishing run, based on the existing models.
    (Started as try3 on lopez)

Sat Jun 3 17:21:47 PDT 2006 Kevin Karplus
    try3-opt2 scores best on the try3 and try2 costfcn, though try1-opt2
    still scores best on the try1 costfcn.
    Rosetta likes repacking try3 the best, though it still gives it
    positive energy.

Sat Jun 3 17:36:54 PDT 2006 Kevin Karplus
    I'm starting one more polishing run, trying to close the remaining
    gaps (try4 on lopez).

Sun Jun 4 04:52:04 PDT 2006 Kevin Karplus
    try4-opt2 now scores the best, and try4-opt2.repack-nonPC scores
    well with rosetta.  I think we've reached the point of diminishing
    returns here and might as well submit.

        try4-opt2
        try4-opt2.repack-nonPC
        try1-opt2
        undertaker-align 1    1p4oA
        undertaker-align 2    2evaA

Sun Jun 4 05:36:49 PDT 2006 Kevin Karplus
    I ran into submission problems with the models from alignment,
    because the duplicated atoms caused undertaker to remove the "wrong"
    duplicates.

    # command:# ReadConformPDB reading from PDB file T0291.undertaker-align.pdb looking for model 1
    WARNING: atoms too close: (T0291)D180.C and (T0291)P181.C only 0.0000000e+00 apart, marking (T0291)P181.C as missing
    WARNING: atoms too close: (T0291)P181.N and (T0291)E182.N only 0.0000000e+00 apart, marking (T0291)E182.N as missing
    WARNING: atoms too close: (T0291)P181.CA and (T0291)E182.CA only 0.0000000e+00 apart, marking (T0291)E182.CA as missing

        removing P181 is ok, but removing E182 is wrong

    # command:# ReadConformPDB reading from PDB file T0291.undertaker-align.pdb looking for model 2
    WARNING: atoms too close: (T0291)L46.C and (T0291)K51.C only 0.0000000e+00 apart, marking (T0291)K51.C as missing
    WARNING: atoms too close: (T0291)K47.N and (T0291)K52.N only 0.0000000e+00 apart, marking (T0291)K52.N as missing
    WARNING: atoms too close: (T0291)K47.CA and (T0291)K52.CA only 0.0000000e+00 apart, marking (T0291)K52.CA as missing
    WARNING: atoms too close: (T0291)K62.C and (T0291)Y65.C only 0.0000000e+00 apart, marking (T0291)Y65.C as missing
    WARNING: atoms too close: (T0291)V63.N and (T0291)T66.N only 0.0000000e+00 apart, marking (T0291)T66.N as missing
    WARNING: atoms too close: (T0291)V63.CA and (T0291)T66.CA only 0.0000000e+00 apart, marking (T0291)T66.CA as missing
    WARNING: atoms too close: (T0291)S225.C and (T0291)Y226.C only 0.0000000e+00 apart, marking (T0291)Y226.C as missing
    WARNING: atoms too close: (T0291)Y226.N and (T0291)G227.N only 0.0000000e+00 apart, marking (T0291)G227.N as missing
    WARNING: atoms too close: (T0291)Y226.CA and (T0291)G227.CA only 0.0000000e+00 apart, marking (T0291)G227.CA as missing

        removing K51 is ok, but removing K52 is wrong
        removing Y65 is ok, but removing T66 is wrong
        removing Y226 is ok, but removing G227 is wrong

    Perhaps I should change the heuristic used for deciding which atoms
    to mark as missing.  I could favor removing atoms from residues that
    are already incomplete.

Sun Jun 4 05:51:38 PDT 2006 Kevin Karplus
    OK, I changed undertaker:

    # command:# ReadConformPDB reading from PDB file T0291.undertaker-align.pdb looking for model 1
    WARNING: atoms too close: (T0291)D180.C and (T0291)P181.C only 0.0000000e+00 apart, marking (T0291)P181.C as missing
    WARNING: atoms too close: (T0291)P181.N and (T0291)E182.N only 0.0000000e+00 apart, marking (T0291)P181.N as missing
    WARNING: atoms too close: (T0291)P181.CA and (T0291)E182.CA only 0.0000000e+00 apart, marking (T0291)P181.CA as missing
    # command:# ReadConformPDB reading from PDB file T0291.undertaker-align.pdb looking for model 2
    WARNING: atoms too close: (T0291)L46.C and (T0291)K51.C only 0.0000000e+00 apart, marking (T0291)K51.C as missing
    WARNING: atoms too close: (T0291)K47.N and (T0291)K52.N only 0.0000000e+00 apart, marking (T0291)K47.N as missing
    WARNING: atoms too close: (T0291)K47.CA and (T0291)K52.CA only 0.0000000e+00 apart, marking (T0291)K47.CA as missing
    WARNING: atoms too close: (T0291)K62.C and (T0291)Y65.C only 0.0000000e+00 apart, marking (T0291)Y65.C as missing
    WARNING: atoms too close: (T0291)V63.N and (T0291)T66.N only 0.0000000e+00 apart, marking (T0291)V63.N as missing
    WARNING: atoms too close: (T0291)V63.CA and (T0291)T66.CA only 0.0000000e+00 apart, marking (T0291)V63.CA as missing
    WARNING: atoms too close: (T0291)S225.C and (T0291)Y226.C only 0.0000000e+00 apart, marking (T0291)Y226.C as missing
    WARNING: atoms too close: (T0291)Y226.N and (T0291)G227.N only 0.0000000e+00 apart, marking (T0291)Y226.N as missing
    WARNING: atoms too close: (T0291)Y226.CA and (T0291)G227.CA only 0.0000000e+00 apart, marking (T0291)Y226.CA as missing

    I have resubmitted the models.

Wed Jun 14 10:08:42 PDT 2006 Kevin Karplus
    Solution released as 2gsfA.

Wed Jun 14 17:04:13 PDT 2006 Kevin Karplus
    Our model1 does not do too badly on this one, though CIRCLE_TS1,
    FAMS_TS5, and RAPTOR-ACE_TS1 beat it.  (That's with my evaluation,
    which is weighted on contacts and rmsd---with GDT our model looks
    much worse: 83.1% vs. Zhang-Server_TS1 at 91.7%.)

    With all-atom RMSD, the order is
        CIRCLE_TS1, FAMS_TS5, try3-opt1...model1

    How does Zhang-Server_TS1 get such a good GDT with such a crummy
    rmsd?  They really messed up the C-terminal end, but may have done
    better on one of the internal loops.

    The most difficult loop turned out to be disordered.

Fri Jul 14 11:18:48 PDT 2006 Kevin Karplus
    On the improved evaluation in evaluate.unconstrained.pretty,
    the best model is now CIRCLE_TS1=FAMS_TS5 (-1.4504)
    Of the TS1 models, SAM_T06_server is 20 out of 53 (-0.6726, though
        scwrling it would have helped a tiny bit: -0.6731)
    Our best model is try3-opt2.gromacs0 (-1.2665)
    Our best submitted is model1 (-1.2431)

    Our server did rather poorly on this target---why?
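
Note on the duplicate-atom heuristic from the Sun Jun 4 entries: the sketch
below is one way the change described there ("favor removing atoms from
residues that are already incomplete") could look.  It is not undertaker's
code.  The Atom class, the function names, the backbone-only definition of
"incomplete", and the toy coordinates are all assumptions made for
illustration; in particular, P181 lacking an O atom in the alignment model
is a guess that happens to reproduce the warnings logged for model 1.

```python
from dataclasses import dataclass
from itertools import combinations
import math

# Assumption: a residue counts as "complete" if all four backbone atoms are present.
BACKBONE = {"N", "CA", "C", "O"}


@dataclass(frozen=True)
class Atom:
    resnum: int
    resname: str  # one-letter code, as in the undertaker warnings
    name: str
    xyz: tuple


def is_incomplete(resnum, atoms, missing):
    """True if the residue lacks a backbone atom (absent or already marked missing)."""
    present = {a.name for a in atoms if a.resnum == resnum and a not in missing}
    return not BACKBONE <= present


def mark_duplicates(atoms, prefer_incomplete=True, tol=1e-6):
    """Return labels of atoms marked missing because they coincide with another atom.

    prefer_incomplete=False always drops the later atom of a coincident pair.
    prefer_incomplete=True drops the earlier atom instead when its residue is
    already incomplete and the later atom's residue is not.
    """
    missing = []
    for a, b in combinations(atoms, 2):
        if a in missing or b in missing or math.dist(a.xyz, b.xyz) > tol:
            continue
        victim = b
        if (prefer_incomplete
                and is_incomplete(a.resnum, atoms, missing)
                and not is_incomplete(b.resnum, atoms, missing)):
            victim = a
        missing.append(victim)
    return [f"{m.resname}{m.resnum}.{m.name}" for m in missing]


def demo_model1():
    """Toy stand-in for the D180/P181/E182 junction (coordinates are invented)."""
    c180, n182, ca182 = (2.5, 1.0, 0.0), (3.7, 0.5, 0.0), (5.0, 1.0, 0.0)
    return [
        Atom(180, "D", "N", (0.0, 0.0, 0.0)),
        Atom(180, "D", "CA", (1.5, 0.0, 0.0)),
        Atom(180, "D", "C", c180),
        Atom(180, "D", "O", (2.5, 2.2, 0.0)),
        Atom(181, "P", "N", n182),    # coincides with E182.N
        Atom(181, "P", "CA", ca182),  # coincides with E182.CA
        Atom(181, "P", "C", c180),    # coincides with D180.C
        Atom(182, "E", "N", n182),
        Atom(182, "E", "CA", ca182),
        Atom(182, "E", "C", (6.0, 0.0, 0.0)),
        Atom(182, "E", "O", (6.0, -1.2, 0.0)),
    ]


if __name__ == "__main__":
    print(mark_duplicates(demo_model1(), prefer_incomplete=False))
    # ['P181.C', 'E182.N', 'E182.CA']
    print(mark_duplicates(demo_model1(), prefer_incomplete=True))
    # ['P181.C', 'P181.N', 'P181.CA']
```

On this toy input the first call matches the original model 1 warnings
(E182 wrongly stripped) and the second matches the warnings after the
change (only P181 stripped).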