Fri Jul 11 09:14:27 PDT 2008 T0500
Thu Jul 17 17:18:41 PDT 2008 SAM-T08-MQAO hand QA T0500 Submitted
Thu Jul 17 17:18:41 PDT 2008 SAM-T08-MQAU hand QA T0500 Submitted
Thu Jul 17 17:18:41 PDT 2008 SAM-T08-MQAC hand QA T0500 Submitted
Make started Mon Jul 21 12:08:33 PDT 2008 Running on cheep.cse.ucsc.edu

    I just noticed that this is a HUMAN+SERVER target, and I hadn't even started running it!
    It is a HUGE protein, and they're claiming to have solved it with NMR. How??

Mon Jul 21 17:29:21 PDT 2008 Kevin Karplus
    There is a small domain (around 80 residues) that is getting good hits to a.60.1.2.
    I wonder if that is all they solved.

Tue Jul 22 10:06:12 PDT 2008 Kevin Karplus
    The MQAU and MQAC quality assessments both put SAM-T08-server_TS1 first, but with very low GDT predicted.
    I should probably do an MQAY1 metaserver run that excludes the SAM servers.

Tue Jul 22 10:12:25 PDT 2008 Kevin Karplus
    I was thinking of trying to cut out the one good hit to a.60.1.2 domains, but even the 1v38A alignments can't agree on where it aligns.

Tue Jul 22 10:51:28 PDT 2008 Kevin Karplus
    There are not a lot of sequences in the multiple alignments (50-69), but the conservation signals are pretty strong.
    The t2k alignment has better signals at the N-terminus, but for P485-P590, the t06 alignment is stronger.

    There is not a lot of regular secondary structure predicted--mainly a few helices at the N and C termini and a few scattered strands.

    I'm wondering whether this target is worth spending time on---I'm dubious that they'll be able to solve an 829-long protein by NMR.
    There are 4 NMR files over 400 residues in PDB:
        1y8bA (731 residues = 1p7tA)
        2bruA (401 residues = 1x13A)
        2jqxA (723 residues = 1d8cA)
        2vdaA (828 residues)
    Since 2vdaA has no corresponding X-ray structure, there must be some labs that can handle that huge an NMR structure.
    It probably helps that there are 3 very separate domains in that structure, which may be more or less independently solvable.
    I wonder how the relationship between the domains was determined.

    In any case, I think I'll need to split this protein into domains in order to do any prediction.

    According to Swissprot: http://www.expasy.org/cgi-bin/niceprot.pl?Q8WXD9
    * FUNCTION: May link the scaffolding protein CASK to downstream intracellular effectors (By similarity).
    * SUBUNIT: Binds the CaM kinase domain of CASK. Forms a ternary complex with CASK and LIN7A, LIN7B or LIN7C. Competes with APBA1 that forms a similar complex with CASK and LIN7 proteins. The tripartite complex CASKIN1/CASK/LIN7(A/B/C) binds the cytoplasmic tail of NRXN1 (By similarity).
    * SUBCELLULAR LOCATION: Cytoplasm (By similarity).
    * SIMILARITY: Contains 6 ANK repeats.
    * SIMILARITY: Contains 2 SAM (sterile alpha motif) domains.
    * SIMILARITY: Contains 1 SH3 domain.

    Key       From    To  Length  Description
    CHAIN        1  1431    1431  Caskin-1.
    REPEAT      48    77      30  ANK 1.
    REPEAT      81   110      30  ANK 2.
    REPEAT     114   143      30  ANK 3.
    REPEAT     147   176      30  ANK 4.
    REPEAT     188   217      30  ANK 5.
    REPEAT     220   249      30  ANK 6.
    DOMAIN     281   347      67  SH3.
    DOMAIN     472   535      64  SAM 1.
    DOMAIN     541   605      65  SAM 2.
    REGION     375   469      95  CASK-binding (By similarity).
    COMPBIAS   714  1365     652  Pro-rich.
    MOD_RES    253   253          Phosphotyrosine (By similarity).
    MOD_RES    646   646          Phosphoserine (By similarity).
    MOD_RES    775   775          Phosphothreonine (By similarity).
    MOD_RES    787   787          Phosphoserine (By similarity).
    MOD_RES   1065  1065          Phosphothreonine (By similarity).
    MOD_RES   1067  1067          Phosphoserine (By similarity).
    MOD_RES   1257  1257          Phosphoserine (By similarity).
    MOD_RES   1364  1364          Phosphoserine (By similarity).
    CONFLICT   278   278          R -> P (in Ref. 2;BAA92544).

    The protein chain that we have starts at A603 of the full protein, so all the known, labeled domains above are not part of what we have to predict.
    All we have for this region is that it is proline rich and that there are a number of residues that may get phosphorylated.
    I checked the homologs found by T06 that had swissprot entries (mouse, rat, frog, human caskin-2), but none of them had any more information on this region.

    I'll try doing "subdomain" predictions that are 200 long, starting every 100 residues.

Created A1-A200, but no make started
Created M101-A300, but no make started
Created K201-K400, but no make started
Created A301-G500, but no make started
Created S401-P600, but no make started
Created P501-S700, but no make started
Created T601-S800, but no make started
Created P701-E829, but no make started

Tue Jul 22 11:36:27 PDT 2008 Kevin Karplus
    I started a make on the moai cluster for each of the 7 "subdomains", but I'm not hopeful of finding anything, other than perhaps where the a.60.1.2 hits are coming from.

Tue Jul 22 21:37:26 PDT 2008 Kevin Karplus
    As expected, the MQAU1 and MQAC1 runs both polished SAM-T08-server_TS1.
    The MQAY1 run is optimizing BAKER-ROBETTA_TS4.

    For the "subdomains", there are no good matches:
        A1-A200/best-evalue: 20.718
        A301-G500/best-evalue: 54.083
        K201-K400/best-evalue: 28.496
        M101-A300/best-evalue: 15.414
        P501-S700/best-evalue: 21.666
        P701-E829/best-evalue: 45.187
        S401-P600/best-evalue: 41.856
        T601-S800/best-evalue: 4.6941

    The only a.60 matches are for A301-G500:
        A301-G500/T0500.best-scores.rdb:1kl9A 182 5.4083e+01 a.60.14.1,b.40.4.5 111574,111575
        A301-G500/T0500.best-scores.rdb:1q46A 175 8.5567e+01 a.60.14.1,b.40.4.5 111594,111595

    I wonder why they came up as possible hits for the whole chain, but not for any of the pieces, given that the a.60.1.2 chains that hit on the whole T0500 were only 60-80 residues long, and the "subdomains" here were 200 long, starting every 100.

Wed Jul 23 16:34:07 PDT 2008 Chirag Sharma
    A1-A200
        T0500.try1-opt3.pdb and T0500.try1-init.pdb (both in best-models.pdb) don't look like proteins.
        There are predictions but nothing substantial.
    A301-G500
        NOTHING at all is predicted. No chance of finding anything here.
    K201-K400
        T0500.try1-opt3.pdb and T0500.try1-init.pdb have very weak predictions.
        Two/three coils are predicted but the coils are not connected to anything of substance.
    M101-A300
        T0500.try1-opt3.pdb has a nice pair of sheets (models 1 & 2); they are not connected, but they have a strong prediction.
        It also has another pair of sheets (model 1), but they are not predicted at all.
        T0500.try1-init.pdb also has nicely predicted sheets (model 5); however, they are not folded.
        They follow one another and look like: --> --> .
        T0500.undertaker-align.pdb also has a decent helix.
        This is the best I have seen so far, but it is still lacking the real components of a protein.
    P501-S700
        Once again nothing at all is predicted by any method.
    P701-E829
        AGAIN!! Nothing is predicted, although there are very nice helices which pack nicely against each other as well.
    S401-P600
        Nothing predicted, anywhere.
    T601-S800
        Absurd. Nothing predicted again.
        There is a nice long helix in each of the models, however, that is not predicted.

Wed Jul 23 17:07:24 PDT 2008 Kevin Karplus
    The M101-A300/try1-opt3 sheet is for roughly G191-E266, but only L261-C262 are predicted to be sheet---the rest is mostly random coil of one sort or another.
    The sheet came from model 5 in best-models, which is model 3 of undertaker-align, which is from 2hb0A.

    The long helix for D765-A794 in T601-S800/try1-opt3 is predicted, and should be in the final model, I think.
    There is another helix in P701-E829/try1, for roughly D799-E829, which may also be worth keeping.
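
A minimal sketch of how the overlapping "subdomains" created earlier in this log (200 long, starting every 100 residues, on the 829-residue chain) could be enumerated. This is not the script that was actually used; the function names overlapping_windows and window_name are hypothetical, and the naming helper assumes the target sequence is available as a one-letter string.

```python
def overlapping_windows(length, window=200, step=100):
    """Return 1-based inclusive (start, end) residue ranges.

    Windows are `window` long and start every `step` residues.
    The last window is truncated at the end of the chain.
    """
    windows = []
    start = 1
    while start <= length:
        end = min(start + window - 1, length)
        windows.append((start, end))
        if end == length:
            break  # the chain end is covered; stop before a redundant window
        start += step
    return windows


def window_name(sequence, start, end):
    """Name a window like 'A1-A200': residue letter plus number at each end.

    `sequence` is assumed to be the one-letter sequence of the target.
    """
    return f"{sequence[start - 1]}{start}-{sequence[end - 1]}{end}"


if __name__ == "__main__":
    for start, end in overlapping_windows(829):
        print(start, end)
```

For a chain of 829 residues this gives eight ranges, 1-200 through 701-829, which matches the eight directories listed in the "Created ..." lines.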
    Looking for helices that the subdomains agree on:
    grep Constraint */decoys/try1-opt3.helices
    A1-A200/decoys/try1-opt3.helices:HelixConstraint (T0500)S56 (T0500)G65
    A1-A200/decoys/try1-opt3.helices:HelixConstraint (T0500)G146 (T0500)A152
    A1-A200/decoys/try1-opt3.helices:HelixConstraint (T0500)P186 (T0500)G191
    M101-A300/decoys/try1-opt3.helices:HelixConstraint (T0500)P118 (T0500)E129
    M101-A300/decoys/try1-opt3.helices:HelixConstraint (T0500)K275 (T0500)S289
    K201-K400/decoys/try1-opt3.helices:HelixConstraint (T0500)H227 (T0500)Y232
    K201-K400/decoys/try1-opt3.helices:HelixConstraint (T0500)G376 (T0500)A388
    A301-G500/decoys/try1-opt3.helices:HelixConstraint (T0500)P467 (T0500)R477 =
    S401-P600/decoys/try1-opt3.helices:HelixConstraint (T0500)S401 (T0500)S409
    S401-P600/decoys/try1-opt3.helices:HelixConstraint (T0500)L442 (T0500)E449
    S401-P600/decoys/try1-opt3.helices:HelixConstraint (T0500)P467 (T0500)G478 +
    S401-P600/decoys/try1-opt3.helices:HelixConstraint (T0500)S519 (T0500)R527
    S401-P600/decoys/try1-opt3.helices:HelixConstraint (T0500)A528 (T0500)Q533
    P501-S700/decoys/try1-opt3.helices:HelixConstraint (T0500)V658 (T0500)T664
    P501-S700/decoys/try1-opt3.helices:HelixConstraint (T0500)V682 (T0500)G687
    T601-S800/decoys/try1-opt3.helices:HelixConstraint (T0500)R742 (T0500)A748 =
    T601-S800/decoys/try1-opt3.helices:HelixConstraint (T0500)D765 (T0500)A794 +
    P701-E829/decoys/try1-opt3.helices:HelixConstraint (T0500)P740 (T0500)A752 +
    P701-E829/decoys/try1-opt3.helices:HelixConstraint (T0500)D765 (T0500)E772 =
    P701-E829/decoys/try1-opt3.helices:HelixConstraint (T0500)C777 (T0500)Q791 =
    P701-E829/decoys/try1-opt3.helices:HelixConstraint (T0500)D799 (T0500)G814 +
    P701-E829/decoys/try1-opt3.helices:HelixConstraint (T0500)M816 (T0500)E829 +

    Looking for sheets that the subdomains agree on:
    grep Constraint */decoys/try1-opt3.sheets
    A1-A200/decoys/try1-opt3.sheets:SheetConstraint (T0500)H100 (T0500)M101 (T0500)Q105 (T0500)S104 hbond (T0500)M101 1
    A1-A200/decoys/try1-opt3.sheets:SheetConstraint (T0500)S23 (T0500)M27 (T0500)A20 (T0500)L16 hbond (T0500)M27 1
    M101-A300/decoys/try1-opt3.sheets:SheetConstraint (T0500)G229 (T0500)P235 (T0500)L263 (T0500)A257 hbond (T0500)F230 1
    M101-A300/decoys/try1-opt3.sheets:SheetConstraint (T0500)G229 (T0500)Y232 (T0500)G194 (T0500)G191 hbond (T0500)A231 1
    M101-A300/decoys/try1-opt3.sheets:SheetConstraint (T0500)T205 (T0500)L209 (T0500)A200 (T0500)A196 hbond (T0500)L209 1
    K201-K400/decoys/try1-opt3.sheets:SheetConstraint (T0500)A306 (T0500)R311 (T0500)R319 (T0500)R314 hbond (T0500)T307 1
    K201-K400/decoys/try1-opt3.sheets:SheetConstraint (T0500)P255 (T0500)V258 (T0500)Q309 (T0500)A306 hbond (T0500)T256 1
    A301-G500/decoys/try1-opt3.sheets:SheetConstraint (T0500)G412 (T0500)G415 (T0500)S409 (T0500)L406 hbond (T0500)G415 1
    A301-G500/decoys/try1-opt3.sheets:SheetConstraint (T0500)L406 (T0500)S409 (T0500)V377 (T0500)L374 hbond (T0500)S409 1
    S401-P600/decoys/try1-opt3.sheets:SheetConstraint (T0500)P564 (T0500)H569 (T0500)P579 (T0500)T574 hbond (T0500)L565 1
    S401-P600/decoys/try1-opt3.sheets:SheetConstraint (T0500)V546 (T0500)P550 (T0500)S543 (T0500)I539 hbond (T0500)R549 1
    S401-P600/decoys/try1-opt3.sheets:SheetConstraint (T0500)V508 (T0500)G510 (T0500)G486 (T0500)F488 hbond (T0500)E509 1
    S401-P600/decoys/try1-opt3.sheets:SheetConstraint (T0500)A450 (T0500)G452 (T0500)V458 (T0500)E456 hbond (T0500)A450 1
    P501-S700/decoys/try1-opt3.sheets:SheetConstraint (T0500)S655 (T0500)E657 (T0500)G652 (T0500)L650 hbond (T0500)E657 1
    P501-S700/decoys/try1-opt3.sheets:SheetConstraint (T0500)L630 (T0500)L637 (T0500)P651 (T0500)T644 hbond (T0500)L630 1
    P501-S700/decoys/try1-opt3.sheets:SheetConstraint (T0500)V575 (T0500)R578 (T0500)V516 (T0500)S519 hbond (T0500)R578 1
    P501-S700/decoys/try1-opt3.sheets:SheetConstraint (T0500)G573 (T0500)R577 (T0500)Y568 (T0500)P564 hbond (T0500)R577 1
    P501-S700/decoys/try1-opt3.sheets:SheetConstraint (T0500)L565 (T0500)V567 (T0500)V508 (T0500)A506 hbond (T0500)V567 1
    P501-S700/decoys/try1-opt3.sheets:SheetConstraint (T0500)F538 (T0500)E542 (T0500)A528 (T0500)R524 hbond (T0500)E542 1
    P701-E829/decoys/try1-opt3.sheets:SheetConstraint (T0500)G814 (T0500)M816 (T0500)D812 (T0500)G814 hbond (T0500)M816 1
    T601-S800/decoys/try1-opt3.sheets:SheetConstraint (T0500)P628 (T0500)Q632 (T0500)K647 (T0500)P643 hbond (T0500)V629 1

    I'm not seeing a lot of agreement on the strands.
    Maybe I should do a domain for P430-P679 to try to get another view on the predicted strands.

Thu Jul 24 15:20:08 PDT 2008 Kevin Karplus
    P430-P679 picks up a bit of sheet:
    try1-opt3:
        SheetConstraint (T0500)L565 (T0500)H569 (T0500)T541 (T0500)K537 hbond (T0500)H569 1
        SheetConstraint (T0500)I526 (T0500)K529 (T0500)T541 (T0500)F538 hbond (T0500)R527 1
        SheetConstraint (T0500)Q496 (T0500)G500 (T0500)A518 (T0500)A514 hbond (T0500)R497 1
    try1-init:
        SheetConstraint (T0500)V658 (T0500)G663 (T0500)P651 (T0500)K646 hbond (T0500)G663 1
        SheetConstraint (T0500)S566 (T0500)V567 (T0500)V575 (T0500)T574 hbond (T0500)S566 1
        SheetConstraint (T0500)V536 (T0500)K537 (T0500)G571 (T0500)N570 hbond (T0500)K537 1
        SheetConstraint (T0500)P564 (T0500)N570 (T0500)E542 (T0500)V536 hbond (T0500)H569 1
        # SheetConstraint (T0500)I526 (T0500)Q530 (T0500)T541 (T0500)K537 hbond (T0500)R527 1
        # SheetConstraint (T0500)A514 (T0500)A520 (T0500)G500 (T0500)G494 hbond (T0500)S519 1
        # SheetConstraint (T0500)K448 (T0500)I451 (T0500)V458 (T0500)G455 hbond (T0500)E449 1

Mon Jul 28 14:37:39 PDT 2008 Kevin Karplus
    Let's try polishing that part up a bit.
    P430-P679/try2 started to try to polish the sheets found so far.

Mon Jul 28 14:49:50 PDT 2008 Kevin Karplus
    P701-E829/try2 started to try to polish the helices there.

Mon Jul 28 20:57:53 PDT 2008 Kevin Karplus
    The P430-P679/try2 run did not close gaps---they are still horrible.
    It looks like there is a fairly large sheet trying to form.
    This may be worth some more attention.

    P701-E829/try2 looks ok, but not particularly convincing.
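
The check for helices that the overlapping subdomains agree on (the grep over */decoys/try1-opt3.helices earlier in this log) was done by eye. A rough sketch of how it might be automated is below. The function names, the 5-residue overlap threshold, and the choice to ignore the trailing "=" and "+" marks are assumptions made for illustration; the sample lines are copied from the grep output.

```python
import re
from itertools import combinations

# Matches grep output such as
#   A301-G500/decoys/try1-opt3.helices:HelixConstraint (T0500)P467 (T0500)R477 =
# Any trailing hand annotation ("=" or "+") is simply ignored.
HELIX_RE = re.compile(
    r"^(?P<sub>[^/]+)/.*HelixConstraint "
    r"\(T0500\)[A-Z](?P<start>\d+) \(T0500\)[A-Z](?P<end>\d+)"
)


def parse_helices(lines):
    """Return (subdomain, first_residue, last_residue) for each helix line."""
    helices = []
    for line in lines:
        m = HELIX_RE.match(line.strip())
        if m:
            helices.append((m["sub"], int(m["start"]), int(m["end"])))
    return helices


def agreeing_helices(helices, min_overlap=5):
    """Pairs of helices from different subdomains sharing >= min_overlap residues.

    min_overlap=5 is an arbitrary threshold chosen for this sketch.
    """
    pairs = []
    for a, b in combinations(helices, 2):
        if a[0] == b[0]:
            continue  # only compare helices from different subdomain runs
        overlap = min(a[2], b[2]) - max(a[1], b[1]) + 1
        if overlap >= min_overlap:
            pairs.append((a, b, overlap))
    return pairs


SAMPLE = """\
A301-G500/decoys/try1-opt3.helices:HelixConstraint (T0500)P467 (T0500)R477 =
S401-P600/decoys/try1-opt3.helices:HelixConstraint (T0500)P467 (T0500)G478 +
T601-S800/decoys/try1-opt3.helices:HelixConstraint (T0500)R742 (T0500)A748 =
T601-S800/decoys/try1-opt3.helices:HelixConstraint (T0500)D765 (T0500)A794 +
P701-E829/decoys/try1-opt3.helices:HelixConstraint (T0500)P740 (T0500)A752 +
P701-E829/decoys/try1-opt3.helices:HelixConstraint (T0500)D765 (T0500)E772 =
P701-E829/decoys/try1-opt3.helices:HelixConstraint (T0500)C777 (T0500)Q791 =
"""

if __name__ == "__main__":
    for a, b, n in agreeing_helices(parse_helices(SAMPLE.splitlines())):
        print(a, b, n)
```

The same pairwise-overlap idea could be applied to the strand pairs in the .sheets files, though a sheet constraint names two strands, so the comparison would need to match both.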
Wed Jul 30 15:23:33 PDT 2008 Kevin Karplus
    Because undertaker does not have the ability to do crossover operations from incomplete models, I have arbitrarily constructed chimeras by inserting each of the subdomains into MQAU1-opt2.gromacs0.repack-nonPC (which scored best with the try2 costfcn).
    These were crude cut-and-paste operations for the whole subdomain, with no attempt to find good crossover points.

    For try2, I'll do a short run, with the crossover operators turned way up.

Wed Jul 30 17:28:49 PDT 2008 Kevin Karplus
    It looks like try2 has picked up the C-terminal helices, but not the sheets from P430-P679.
    Perhaps I should make a try2 chimera with that region and try polishing it, once try2 is finished.
    I probably also want to put more weight on sheets than helices, since they are harder to form.

Wed Jul 30 17:55:45 PDT 2008 Kevin Karplus
    Rosetta really hates try2-opt3, and gromacs crashes on it.
    The residues rosetta hates are M27, R18, K19, A28, P717, L55, V438, P137, A231, G70, ... because of very bad clashes.

    I made a chimera-try2-P433-V575, which copies the subdomain from P430-P679/try2 into try2-opt3.repack-nonPC, but only that part that has sheets.
    I tried to make the joins with try2 not be too distant.

Wed Jul 30 20:47:59 PDT 2008 Kevin Karplus
    try3-opt3 has the sheets of P430-P679/ and the helices of P701-E829/
    It is still super foamy, and Rosetta thinks it has terrible clashes.

Thu Jul 31 06:15:03 PDT 2008 Kevin Karplus
    try4 scores better with pred_nb11_back and pred_CB14_back, but isn't really any more compact (phobic_fit is actually larger).

    I made a chimera of try4 and try3, copying in G454-V575 from try3-opt3.repack-nonPC, and will try optimizing that for try5, with the emphasis on trying to pack things.
    I don't expect much improvement, though.
    This target is too big and with too little predicted secondary structure to be easy to work with.
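
The crude cut-and-paste chimeras described above (for example, copying G454-V575 from one model into another) can be illustrated with a short sketch. This is not the tool that was actually used to build them. It assumes both models are single-chain PDB files in the same coordinate frame, does no superposition, and does not renumber atom serial numbers; the function names are hypothetical.

```python
def is_atom(line):
    """True for coordinate records in a PDB file."""
    return line.startswith(("ATOM", "HETATM"))


def residue_number(line):
    """Residue sequence number from PDB columns 23-26."""
    return int(line[22:26])


def splice_residues(base_lines, donor_lines, first, last):
    """Copy residues first..last from donor into base (crude cut-and-paste).

    Base coordinate records in the range are dropped and the donor's records
    for that range are written at the position of the first dropped record.
    If the base has no residues in the range, nothing is inserted.
    No attempt is made to find good crossover points or to close the joins.
    """
    donor = [l for l in donor_lines
             if is_atom(l) and first <= residue_number(l) <= last]
    out = []
    inserted = False
    for line in base_lines:
        if is_atom(line) and first <= residue_number(line) <= last:
            if not inserted:
                out.extend(donor)
                inserted = True
            continue  # skip the base model's own atoms for this range
        out.append(line)
    return out


if __name__ == "__main__":
    import sys
    # Hypothetical usage: splice.py base.pdb donor.pdb 454 575 > chimera.pdb
    base_path, donor_path, first, last = sys.argv[1:5]
    with open(base_path) as b, open(donor_path) as d:
        chimera = splice_residues(b.readlines(), d.readlines(),
                                  int(first), int(last))
    sys.stdout.writelines(chimera)
```

A chimera built this way will usually have chain breaks at both joins, which is consistent with the need for gap-closing optimization noted in the log.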
Thu Jul 31 07:26:54 PDT 2008 Kevin Karplus
    try5 seems to be working mainly with try4-opt2, rather than chimera-try4-try3, so I'm starting try6 with the same costfcn, but with only chimera-try4-try3 as a starting model.

Thu Jul 31 08:19:36 PDT 2008 Kevin Karplus
    try5 crashed with
        undertaker: ../ultimate/src/Transform/Transform.h:83: bool Transform::OK() const: Assertion `1.-ERR_LIMIT < rot_mag2 && rot_mag2 < 1.+ERR_LIMIT' failed.
    shortly after creating try5-opt2

    try5-opt2 does better on several measures, but worse on clashes and breaks than try4-opt3.gromacs0.repack-nonPC
    I'll try making the gromacs and rosetta-repacked versions of try5-opt2, and see how they do.

Thu Jul 31 08:59:51 PDT 2008 Kevin Karplus
    All my models are extremely loose.
    Maybe I should break try5 or try6 up into pieces and superimpose them on a more compact model, such as MQAY1.
    Possible breakpoints are P45, P155, P225, P271, P421, P453, P579, P612, P665.

Thu Jul 31 11:00:55 PDT 2008 Kevin Karplus
    I broke try6-opt3.gromacs0.repack-nonPC into fragments and superimposed them on MQAY1-opt3.gromacs0.repack-nonPC.
    I should use MQAY1 for P155-P225, P225-P271, and P421-P453, but the fragments for the rest.
    This will have terrible breaks and need to be reoptimized.

Thu Jul 31 11:24:51 PDT 2008 Kevin Karplus
    try7 started to try to close gaps in chimera-MQAY1-try6

Thu Jul 31 14:45:18 PDT 2008 Kevin Karplus
    try7 still has bad clashes and breaks, but is more compact than try6.
    Let me try polishing it some more, with clashes and breaks turned up higher.

Thu Jul 31 15:24:32 PDT 2008 Kevin Karplus
    Foo!
        undertaker: ../ultimate/src/Transform/Transform.h:83: bool Transform::OK() const: Assertion `1.-ERR_LIMIT < rot_mag2 && rot_mag2 < 1.+ERR_LIMIT' failed.
    before even doing any optimization!
    I'll try running try8 again, and hope it doesn't crash again.

Thu Jul 31 15:53:02 PDT 2008 Kevin Karplus
    Nope. It fails again in the same place.
    I don't have enough time to debug this now, so I think I'll give up on T0500, and just submit the junk I have.

Thu Jul 31 16:21:59 PDT 2008 Kevin Karplus
    Submitted with comment:

    T0500 seemed pretty hopeless to me.
    There was almost no secondary structure predicted, and even breaking the protein up into overlapping 200-long segments did not result in finding templates or consistent secondary structure.

    I did not want to spend a lot of effort on this target, since there is significant question whether NMR can handle such a big protein.
    There are only 4 NMR structures in PDB with chains over 400 residues, and the longest is 2vdaA at 828 residues, so T0500 would represent a new high in NMR structure determination, if it can be solved at all.

    I am submitting 5 rather junky models:

    Model 1 T0500.try6-opt3.gromacs0.pdb # < chimera-try4-try3
        chimera-try4-try3:
            mostly from T0500.try4-opt3.gromacs0.repack-nonPC.pdb
            G454-V575 from T0500.try3-opt3.repack-nonPC.pdb
        try4-opt3 < try3-opt1 < chimera-try2-P433-V575
        try3-opt3 < chimera-try2-P433-V575 + chimera-MQAU1-T601-S800
        chimera-try2-P433-V575:
            mostly T0500.try2-opt3.repack-nonPC.pdb
            P433-V575 from P430-P679/try2-opt3
        try2-opt3 < chimera-MQAU1-S401-P600 +MQAC1-opt3.gromacs0.repack-nonPC
        chimera-MQAU1-T601-S800:
            mostly T0500.MQAU1-opt3.gromacs0.repack-nonPC.pdb
            T601-S800 from T601-S800/try1-opt3
        chimera-MQAU1-S401-P600:
            mostly T0500.MQAU1-opt3.gromacs0.repack-nonPC.pdb
            S401-P600 from S401-P600/try1-opt3
        MQAU1-opt3 < SAM-T08-server_TS1

    2 T0500.MQAY1-opt3.gromacs0.repack-nonPC.pdb # < BAKER-ROBETTA_TS4
        # best Rosetta energy

    3 T0500.MQAC1-opt3.gromacs0.repack-nonPC.pdb # < SAM-T08-server_TS1

    4 T0500.try2-opt3.repack-nonPC.pdb # < chimera-MQAU1-S401-P600 +MQAC1-opt3.gromacs0.repack-nonPC
        This run had high crossover between many models, mostly MQAU1-opt3.gromacs0.repack-nonPC, with each 200-long segment starting every 100 residues replaced by an independently predicted segment.

    5 T0500.try7-opt3.gromacs0.repack-nonPC.pdb < chimera-MQAY1-try6
        chimera-MQAY1-try6:
            overall structure from T0500.MQAY1-opt3.gromacs0.repack-nonPC.pdb
            also P155-P271, P421-P453
            rest from try6-opt3.gromacs0.repack-nonPC, as fragments of various lengths superimposed on MQAY1