Mon May 5 11:07:00 PDT 2008 T0387
Make started Mon May 5 11:07:33 PDT 2008 Running on cheep.cse.ucsc.edu
Make started Mon May 5 11:14:15 PDT 2008 Running on cheep.cse.ucsc.edu
Make started Mon May 5 11:20:07 PDT 2008 Running on cheep.cse.ucsc.edu

Mon May 5 11:21:26 PDT 2008 Kevin Karplus
The first two runs failed, because the CASP organizers hid the sequences behind a database interface, and I had to manually cut and paste the sequences to get them into the a2m file. I sent them a query about their having set up a harder-to-use interface than at CASP7, but I may be forced to do manual cut and paste all summer, since they don't even seem to have a way to retrieve sequences given the target number!

Despite their claims that human predictions would primarily be for low-homology targets, this first target has an almost perfect match in the database (2eejA has 84 identical residues and the target has only 91). This is clearly a "how well can you polish this" problem.

The target is a tetramer at pH=6.2 (which may affect His protonation).

Ah, 2eejA is an NMR structure, so this is an NMR-to-crystal conversion problem. There are floppy ends on the NMR structure of 2eejA (before P8 and after C87). The two CYS in 2eejA do not form a disulfide, nor do they cluster with the sole HIS, so there seems to be no reason to use maybe_metal or maybe_ssbond with this target.

We may want to use as templates other models (not just the first one in the 2eejA file), to sample the floppy ends better.

Looking up the top hits in the T0387.pdb_blast.txt file at http://www.ebi.ac.uk/msd-srv/pqs/ I am seeing mainly monomeric proteins (with a few "no hits", which could mean NMR models). So the tricky thing may be to figure out the tetramer. Is it a pair of dimers? A ring?

Mon May 5 11:44:43 PDT 2008 Kevin Karplus
The t06 alignment found over 9000 similar sequences in nr. This appears to be a PDZ domain from PDZ-domain-containing protein 1, as is 2eejA.
The highly conserved residues of the domain are
    G17      G
    LIVF20   L
    VILA31   I
    LIVM45   L
    DES49    D
    ILV51    I
    LIMVF71  I
    LIMFV80  L

The closest hit in SCOP seems to be 1i92A, the first PDZ domain of Na+/H+ exchanger regulatory factor, NHERF.

Pfam has this to say about PDZ domains:

Literature references
1. Doyle DA, Lee A, Lewis J, Kim E, Sheng M, MacKinnon R. Cell 1996;85:1067-1076. Crystal structures of a complexed and peptide-free membrane protein-binding domain: molecular basis of peptide recognition by PDZ. PUBMED:8674113
2. Ponting CP, Phillips C, Davies KE, Blake DJ. Bioessays 1997;19:469-479. PDZ domains: targeting signalling molecules to sub-membranous sites. PUBMED:9204764
3. Ponting CP. Protein Sci 1997;6:464-468. Evidence for PDZ domains in bacteria, yeast, and plants. PUBMED:9041651

Interpro entry IPR001478
PDZ domains are found in diverse signaling proteins in bacteria, yeasts, plants, insects and vertebrates PUBMED:9041651, PUBMED:9204764. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences PUBMED:9204764. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 µM. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated. PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (betaA to betaF) and two alpha-helices, A and B, compactly arranged in a globular structure.
Peptide binding of the ligand takes place in an elongated surface groove as an antiparallel beta-strand interacts with the betaB strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the betaA and betaB strands.

So one question that immediately springs to mind is whether the PDZ domains are forming the tetramer by binding the C-terminal ends of the other monomers.

PQS does have some dimeric and tetrameric proteins that come up when searching for PDZ:
    1obx and 1oby tetrameric (actually, they are dimeric, with separate peptides bound)

pdb     num  SpGrp     delta  num   num    num   num   num    num percent  delta  type
id      biol name        ASA  S-S SaltB buried chain resid hetatm     ASA   sole
2g2l_1  2    P        2988.5    0     0      0     2    94      0    50.1   78.7  DIMERIC
1g9o_0  1    P3221    1205.4    0     0      0     2   182      0    19.6   -0.4  DIMERIC
2i04_2  2    P         646.4    0     0      0     2    92      5    17.9   -4.0  DIMERIC
2i04_1  2    P         612.4    0     1      0     2    90      5    17.3   -4.2  DIMERIC
1q3p_2  2    P41212    524.8    0     2      0     2   109      0    13.9    0.6  DIMERIC
1q3o_0  1    P1211     510.6    0     0      0     2   208      5     8.0   -8.0  DIMERIC
2i0i_3  3    C         507.9    0     0      0     2    87      0    14.5   -3.7  DIMERIC
1q3p_1  2    P41212    504.8    0     2      0     2   104      0    14.3   -0.5  DIMERIC
2i0i_2  3    C         487.0    0     0      0     2    87      0    14.2   -3.9  DIMERIC
2i0i_1  3    C         424.2    0     0      0     2    87      0    12.7   -3.2  DIMERIC
1ihj_2  2    P1        332.6    0     0      0     2    99      0    10.3   -4.9  DIMERIC
2i0l_1  2    C         313.2    0     0      0     2   166      0     5.1    1.2  DIMERIC
1ihj_1  2    P1        308.2    0     0      0     2   100      0     9.5   -5.8  DIMERIC
2i0l_2  2    C          15.5    0     0      0     2    11      0     1.3   -1.1  DIMERIC
2g2l_2  2    P           0.0    0     0      0     2     0      0     0.0    0.0  DIMERIC

The 1g9o dimer does have the C-termini bound by the other domain, but not making the full strand of the sheet that PDZ-bound peptides usually form, though the carboxyl terminus does form H-bonds to 2 backbone N atoms in the binding pocket.

2i0i has 3 monomers, reported as dimers by PQS because of the bound peptides.
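As a rough aid for reading the PQS listing for the PDZ hits, here is a minimal Python sketch that parses a few of those rows and ranks them by buried surface (delta ASA). The column positions are read off the two-line PQS header, and the 400 square-Angstrom cutoff is a purely hypothetical threshold chosen for illustration, not a PQS rule.

```python
from typing import NamedTuple

# A subset of the PQS rows quoted in this log (whitespace-separated fields).
PQS_ROWS = """\
2g2l_1 2 P 2988.5 0 0 0 2 94 0 50.1 78.7 DIMERIC
1g9o_0 1 P3221 1205.4 0 0 0 2 182 0 19.6 -0.4 DIMERIC
2i04_2 2 P 646.4 0 0 0 2 92 5 17.9 -4.0 DIMERIC
1ihj_1 2 P1 308.2 0 0 0 2 100 0 9.5 -5.8 DIMERIC
2i0l_2 2 C 15.5 0 0 0 2 11 0 1.3 -1.1 DIMERIC
2g2l_2 2 P 0.0 0 0 0 2 0 0 0.0 0.0 DIMERIC
"""


class PqsEntry(NamedTuple):
    pdb_id: str
    space_group: str
    delta_asa: float    # surface buried on assembly
    n_resid: int        # residues in the assembly
    percent_asa: float


def parse_row(line: str) -> PqsEntry:
    """Pick the fields of interest out of one 13-column PQS row."""
    f = line.split()
    return PqsEntry(f[0], f[2], float(f[3]), int(f[8]), float(f[10]))


entries = [parse_row(line) for line in PQS_ROWS.splitlines() if line.strip()]

# Largest interfaces first.
ranked = sorted(entries, key=lambda e: e.delta_asa, reverse=True)

# Hypothetical cutoff: ignore assemblies that bury very little surface.
MIN_DELTA_ASA = 400.0
plausible = [e.pdb_id for e in ranked if e.delta_asa >= MIN_DELTA_ASA]

for e in ranked:
    print(f"{e.pdb_id:8s} {e.delta_asa:8.1f} {e.n_resid:4d} {e.percent_asa:5.1f}")
```

Entries such as 2i0l_2 and 2g2l_2 fall out immediately under any reasonable cutoff, which leaves 2g2l_1 and 1g9o_0 as the interfaces worth looking at by eye.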
Mon May 5 12:41:02 PDT 2008 Kevin Karplus
The t2k iterated search finds a quite different signal than the t04 and t06 searches, though it is based on just over 6000 sequences:
    VIL31    I
    VILPA34  V
    AGS40    A
    LIVM45   L
    DE49     D
    ILV51    I
    VILA54   V
    NDG55    N

Even this model has drifted a bit from the target, with 1g9oA scoring best, and 2eejA in 13th place. All the top hits are, of course, the PDZ domain family, so are structurally quite close.

Since 2eejA is not in the template library, we may need to add it manually to the set of alignments to consider.

Mon May 5 14:01:14 PDT 2008 Kevin Karplus
try1-opt3 folds the C-terminal tail back into the protein, which is most likely wrong---we probably have the crystal formed by a domain-swapped dimer.

I'm adding a "MANUAL_TOP_HITS" definition to the Makefile, so that I can make alignments to all the top hits:
    MANUAL_TOP_HITS:= 2eejA 2ocsA 1i92A 1gq4A 1g9oA 1gq5A

For try2, I'll do another monomer optimization, but after that I'll have to figure out how to do dimer optimization.

Mon May 5 18:56:21 PDT 2008 Kevin Karplus
I set up a bunch of stuff in the Make.main file for creating dimers a little more easily than in casp7, and have started a dimer run in dimer/

I added
    ConstraintSet dimer_pair
        Hbond V91.O G108.N
        Hbond V182.O G17.N
to the costfcn to try to get the C-termini into the proper binding pocket in the other monomer. I'm pretty sure that G17 is one of the two N-atoms that Hbonds to the carboxyl terminus.

Mon May 5 20:50:38 PDT 2008 Kevin Karplus
I was just checking what the servers did:
    SAM_T02: alignments to 1g9oA, 2ozfA, 2he4A, 2ocsA, 1tp5A
    SAM_T06: undertaker crashed, so just models from alignments to 1g9oA, 2ozfA, 2he4A, 2he2A
    SAM_T08: submitted try1-opt3, try1-init, and models from alignments to 1g9oA, 2ozfA, 1tp5A

Tue May 6 10:17:02 PDT 2008 Kevin Karplus
I think that the try2 run did not include the 2eejA alignments, so I'm trying again with try3.
I also added two constraints on the final residue, to make the tail stick out more or less where the tail does on 1g9oA, so that dimerization might work better. I'll use the try3 costfcn for the initial selection (with clashes scaled down), instead of MQA_init.costfcn.

I might want to see if I can add an OXT atom to the end of the chain, so that I can have both Hbonds in the dimer, but that won't be in try3.

Tue May 6 10:27:04 PDT 2008 Kevin Karplus
The dimer in dimer/decoys/T0387.try2-opt3.pdb.gz has some bad breaks, but the second copy of the dimer looks pretty good. I might want to take those two monomers and put them into a polishing run for the monomers, before building another dimer.

Tue May 6 12:59:13 PDT 2008 Kevin Karplus
The try3-opt3 monomer scores a bit better than the monomers from the dimer:
    decoys/dimer-try2-opt2-A.pdb.gz and decoys/dimer-try2-opt2-B.pdb.gz

For try4, I'll run the same script as try3, but I'll try starting from a blank pdb file that has an OXT atom on the end, to see if that will give me monomers with OXT. If it does, I'll make the dimer from there.

Tue May 6 13:14:42 PDT 2008 Kevin Karplus
That did not work---I got an assertion failure trying to read in the file:
    # ReadTargetPDB reading from PDB file T0387.plusOXT.pdb looking for model 1
    undertaker: Conformation.cc:399: virtual void Conformation::append_fragment(int, const ChainsResiduesAndAtoms*, int, int): Assertion `Master->atom(splice2_N_atom-1).no_wc_match( PDBAtomAlphabet->to_base("C"))' failed.
    Warning: all-zero PDB file read in ReadTargetPDB, so making up random conformation

Maybe I should just make a dimer of the try3-opt3 monomer and not worry about OXT for now.

Tue May 6 15:42:21 PDT 2008 Kevin Karplus
After a little less than an hour, the dimer/try3 run failed with an assertion failure. I think I've fixed the bug (trying to close the KnownBreak), and am trying again.
The dimer/try3-opt2 file (now dimer/decoys/T0387.try3-opt2-run1.pdb.gz) looks fairly good--the C-terminus does neatly fit into the binding pocket of the other monomer.

Tue May 6 17:58:55 PDT 2008 Kevin Karplus
The dimer/try3-opt3 file looks pretty good. Perhaps I should do a polishing run to try to pack things a little tighter, then split up the best dimers into monomers.

Tue May 6 19:47:11 PDT 2008 Kevin Karplus
The dimer/try4-opt3 run looks pretty good, though the clashes are higher than I would like. For try4, I'll correctly run the OptConform with "multimer 2" (which I had forgotten about) and make T0387.mult4 instead of T0387.try4, so that gromacs will be run correctly on the unpacked dimers.

Tue May 6 19:52:11 PDT 2008 Kevin Karplus
Oops, by starting T0387.mult1 in dimer, I accidentally stepped on dimer/try1.costfcn. I killed the run before any further harm was done, and mult2, mult3, and mult4 ran without trouble.

Tue May 6 20:14:37 PDT 2008 Kevin Karplus
The mult5 run failed with an assertion failure in undertaker after quite a while. I'll have to set the seed and rerun it under the debugger.

Tue May 6 20:59:36 PDT 2008 Kevin Karplus
Even with the seed set it didn't crash under the debugger. I hate intermittent faults!! I think I'll just make some minor mods to the dimer/try5.under file and rerun, hoping not to crash.

Tue May 6 21:26:14 PDT 2008 Kevin Karplus
Without the debugger, it crashed somewhat later at a different assertion. This is getting irritating! I don't think that it was even doing much to improve the dimer.

Wed May 7 08:43:25 PDT 2008 Kevin Karplus
I wonder if I should try an additional model: a cyclic tetramer. It seems less likely than a dimer, but the packing of the dimer with a lot of negative charges clustered in the dimer interface seems unlikely. The clash between the D47 residues is particularly bad. Opening it up to a cyclic tetramer might relieve the clashes.
I don't have a tetramer to work from, and I wonder if I can create one with just undertaker. If not, I'll have to try using Proteinshop.

Wed May 7 09:19:47 PDT 2008 Kevin Karplus
I tried making a stupid tetramer (putting two copies of the dimer in exactly the same place), to see if undertaker could turn it into a cyclic tetramer, using TweakMultimer. (Maybe OptSubtree would be better?)

Wed May 7 09:32:17 PDT 2008 Kevin Karplus
That didn't work: the duplicated atoms were marked as missing, and undertaker can't optimize an incomplete conformation, so it crashed for having no conformations! Maybe I should up InsertAlignment and start from a random conformation.

Wed May 7 12:00:55 PDT 2008 Kevin Karplus
The tetramer/try1 run does seem to form a tetramer that has the right conformation for the core and satisfies the constraints I gave it for the C-terminal docking, but I might want to add more constraints, to try to get the C-terminus to approach the normal PDZ binding as an extra strand of the sheet. There does not seem to be the buried-charge problem of the dimer.

Wed May 7 12:29:59 PDT 2008 Kevin Karplus
tetramer/try2 will attempt to form a tetramer with more normal binding of the C-terminus into the binding pocket.

Wed May 7 13:55:34 PDT 2008 Kevin Karplus
tetramer/try2 again does a decent job of getting the constraints I specified, but not quite with a believable tetrameric structure. Maybe I should increase the strand constraints for a full 6 C-terminal residues. Having more conformations in the initial multimerization pass might also help.

Wed May 7 18:39:57 PDT 2008 Kevin Karplus
tetramer/try3 does a little better, but the multimers are too spread out and there is a bend around Q88. If we could make the tail straighter, we could probably get a tighter packing. Maybe I should add some constraints between R115 and E46, D47, or E48, to try to pull things together.
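The residue numbers above 91 that show up in the multimer constraints (G108, V182, R115) look like a concatenated numbering of identical 91-residue chains, so that G108 is G17 of the second chain and V182 is its V91. That reading is inferred from the numbers in this log, not taken from undertaker documentation. A minimal Python sketch of the conversion, with hypothetical helper names:

```python
CHAIN_LENGTH = 91  # residues per T0387 monomer, as stated in this log


def split_index(global_resnum: int, chain_length: int = CHAIN_LENGTH):
    """Map a 1-based concatenated residue number to (chain letter, local number).

    Assumes identical chains laid end to end: A = 1..91, B = 92..182, ...
    """
    if global_resnum < 1:
        raise ValueError("residue numbers are 1-based")
    chain_index, offset = divmod(global_resnum - 1, chain_length)
    return chr(ord("A") + chain_index), offset + 1


def join_index(chain: str, local_resnum: int, chain_length: int = CHAIN_LENGTH) -> int:
    """Inverse of split_index: ('B', 17) gives 108."""
    if not 1 <= local_resnum <= chain_length:
        raise ValueError("local residue number out of range")
    return (ord(chain) - ord("A")) * chain_length + local_resnum


# Residue numbers that appear in the dimer and tetramer constraints.
for n in (91, 108, 182, 115, 109):
    print(n, split_index(n))
```

Under this assumption, R115 is residue 24 of the second chain and F109 is residue 18, which is a quick sanity check when writing inter-chain constraints for the third and fourth copies of a tetramer.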
Thu May 8 08:59:08 PDT 2008 Kevin Karplus
tetramer/try4 is a complete mess, with bad conflicts between the monomers. The E46-E48 contacts with R115 are made, but the originally desired ones between V91 and F109 are not. This is where I'd really like a manipulable model, so that I could put the monomers roughly where I think they ought to go, and tweak the C-terminal strand to fit.

Sat May 10 08:20:29 PDT 2008 Kevin Karplus
I picked up the server tarball, and scored everything with the MQA_init costfcn. The dimer/try5-opt1 model scores best, followed by try1-opt3. The best-scoring external one is HHpred5_TS1 (not surprising for a close homology model). I should try building dimers and tetramers off of that model.

Mon May 12 10:25:20 PDT 2008 Kevin Karplus
I made some changes to the Make.main file, so that we can make undertaker scripts to read the top 10 models from the MQA evaluations. I've started a run from the top 10 MQAC models, using the try3 costfcn.

Mon May 12 11:59:50 PDT 2008 Kevin Karplus
The MQAC-try1 run seems to have favored Pcons_multi_TS3, which actually comes out quite close to try1-opt3 and to Zhang-Server_T3 and our first alignment (to 1g9oA). All the core residues are essentially in the same places in all these models---even the sidechains superimpose very well.

Mon May 12 12:04:49 PDT 2008 Kevin Karplus
I'll do another run from the top 10 MQAU-chosen models, again with try3 as the costfcn. I've modified the metaserve-MQAx.under scripts to skip the TryAllAlign stuff at the beginning---these are highly polished models already, and adding alignments is not likely to help.

After that, I should try polishing from the full set and making a dimer again. Making the right tetramer is probably more important, and more difficult.

Mon May 12 12:30:40 PDT 2008 Kevin Karplus
The MQAU run seems to favor 3D-JIGSAW_AEP_TS1.

Mon May 12 12:46:48 PDT 2008 John Archie
The QA files have been submitted.
Mon May 12 12:50:50 PDT 2008 Kevin Karplus
The core of all our predictions (and a lot of the server predictions) is the same, superimposed to half an Angstrom or less. The C-terminal tail is what varies the most, and that is almost certainly determined by the multimerization.

The tail for MQAU1 is different from other models, even different from the 3D-JIGSAW_AEP_TS1 that the run favored at the beginning. It makes some Hbonds at the end, and looks like a reasonable monomeric solution, but I don't think it will multimerize well.

I have to figure out how to get a good tetramer still.

Fri May 23 14:42:44 PDT 2008 Kevin Karplus
We don't seem to have a working version of ProteinShop, and Baker has not released a version of FoldIt that we can use, so I'll have to get the effect I want with undertaker, which may be difficult.

I could try breaking off the C-terminal peptide and docking it with undertaker, but how do I then convert that into the tetramer?

Fri May 23 15:00:04 PDT 2008 Kevin Karplus
For try5, I broke off the C-terminal end, and am trying to dock it into the normal PDZ binding site. If that works, I'll try to find a way to make a tetramer out of the two pieces, perhaps by superimposing one contiguous model on the first 83 residues and another model on the remaining residues, then symmetrizing.

Fri May 23 15:53:41 PDT 2008 Kevin Karplus
RATS! try5 did not put the C-terminal peptide where I wanted. It seems that it moved the gap away from before G84 by inserting fragments, so that OptSegment never got a chance to fix it. I'll try again with ONLY the segment operations for the opt1 part (try6).

Fri May 23 17:19:24 PDT 2008 Kevin Karplus
Nope---that doesn't do it. It looks like the Opt operations put it back together---they must not check KnownBreak.

Fri May 23 18:16:11 PDT 2008 Kevin Karplus
I tried fixing just the OptSubtree and OptSegment operators, and it didn't help either.
So I made sure that only constraints from the costfcn were included in all the operations that chose constraints, and removed all constraints except the final_tail constraints from the initial run for try8.

Fri May 23 21:01:43 PDT 2008 Kevin Karplus
try8 managed to make one of the Hbonds, but never placed the strand correctly. I don't know if this is a bug in OptSegment, or just a bad costfcn (perhaps with too high a clash penalty). For try9, I'm trying again with lower clash penalty and with "ReportCost try9.rdb" so that I can see what costs are being generated.

Fri May 23 21:21:22 PDT 2008 Kevin Karplus
Oops---ReportCost only affects the CostConform commands, not the OptConform commands. I need to add report_all_costs to the OptConform command as well!

Sat May 24 05:38:53 PDT 2008 Kevin Karplus
On try9, looking at how the final_tail cost compares to other costs (in gnuplot, using commands like
    plot '< smooth-rdb -name1 final_tail -name2 soft_clashes < try9.rdb' with lines
) I can see that there were a few models built in which the final_tail values got down to low costs, but that they had bad clashes and bad dry5, dry6.5, and dry8 values (some other costs also got bad). The combined effect was that an improvement of 0.5 in final_tail incurred a total cost increase of about 100, so we'd need to add about 200 to the weight of final_tail to save these models. Let me try that for try10.

Sat May 24 06:21:51 PDT 2008 Kevin Karplus
It looks like (with sufficient weight) we can force the final_tail constraints, but with terrible clashes and loss of H-bonds. So the problem is not an algorithmic one, but just that getting a good fit is difficult.

Sat May 24 10:15:28 PDT 2008 Kevin Karplus
The bent C-terminal tail in try10 does *not* form the desired sheet, though it does make the V91.O Hbonds.
Instead it sticks into the main domain on the wrong side of the strand and disrupts everything. Perhaps what I need to do is to chop the tail up more so that it can be reassembled more readily, and allow fragment insertions earlier. I could probably also reduce the weight of final_tail a bit, so that it is not quite so insistent on it at the expense of everything else, and reduce the V91.O hbonds relative to the strand. Adding a constraint that V91-K86 is around 15.9 Angstroms CA-CA might help also.

Sat May 24 12:26:38 PDT 2008 Kevin Karplus
try11-opt3 is a little closer to what I want, but still pretty messed up. undertaker does not seem to be a good tool for this sort of docking! Part of the problem THIS time is that I allowed fragment and alignment insertion early, so that the known break at G84 went away, and undertaker was then trying to keep the break before A87 small. Let me try once more, with the fragment and alignment insertion operators initially turned off.

Sat May 24 17:33:09 PDT 2008 Kevin Karplus
Nope, try12 has the same trouble. I think I'll probably give up on this approach. The question now is whether I can get a tetramer, or if I should just report the best monomers using the try4 costfcn, which I'll recompute after changing the costfcn-init.under script to use stricter clash definitions.
T0387.MQAU1-opt3.pdb -30.0 -14.0 -1.9 -1.5 -37.4 -15.0 -2.4 -3.5 -4.7 8.4 48.2 46.9 132.8 141.2 -2.3 -55.3 2.3 0.9 7.3 0.1 0.0 -5.2 -3.8 -4.3 -5.7 201.03
T0387.try3-opt3.pdb.gz -30.0 -14.8 -2.0 -1.7 -37.5 -14.0 -2.5 -3.4 -4.5 8.0 48.7 49.0 138.0 145.8 -2.2 -55.5 3.0 0.6 6.2 0.3 1.0 -4.9 -3.9 -4.5 -5.9 213.39
T0387.MQAC1-opt3.pdb -29.9 -15.0 -1.8 -1.5 -38.1 -14.8 -2.6 -3.4 -4.5 8.2 49.6 50.0 139.0 147.2 -2.1 -55.3 2.7 1.1 4.4 0.3 0.7 -5.0 -3.7 -4.5 -5.4 215.43
T0387.try8-opt1.pdb.gz -19.0 -14.0 -1.8 -1.0 -36.4 -14.7 -2.6 -3.0 -4.5 8.1 50.1 49.6 137.1 146.2 -2.0 -54.4 2.7 1.3 6.6 0.4 7.7 -5.0 -3.7 -4.4 -5.7 237.43

Sat May 24 20:16:57 PDT 2008 Kevin Karplus
I gave up on this model and submitted
    Model
    1  T0387.MQAU1-opt3.pdb  a metaserver model
    2  T0387.try3-opt3.pdb   a native SAM/undertaker model
    3  T0387.MQAC1-opt3.pdb  a metaserver model (from consensus scoring)
    4  alignment T0387-1g9oA-t06-local-str2+near-backbone-11-0.8+0.6+0.8-adpstyle5.a2m
    5  alignment T0387-2ozfA-t06-local-str2+near-backbone-11-0.8+0.6+0.8-adpstyle5.a2m

Sun Jun 8 15:52:03 PDT 2008 Kevin Karplus
I just noticed today that the KnownBreak commands in the dimer/costfcn-init.under and tetramer/costfcn-init.under files were wrong---they had bare numbers, which are interpreted as atom numbers rather than as residue numbers. (I should probably fix this in undertaker!)

Make started Thu Feb 12 13:06:54 PST 2009 Running on peep.cse.ucsc.edu
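For reference, the submission order of models 1 to 3 on May 24 matches a simple sort of the four score rows from that entry by their last column. The Python sketch below assumes that the final column is the total cost and that lower is better; the log does not label the columns, so that reading is an assumption.

```python
# Score rows copied from the May 24 entry; columns are unlabeled in the log.
SCORE_ROWS = """\
T0387.MQAU1-opt3.pdb -30.0 -14.0 -1.9 -1.5 -37.4 -15.0 -2.4 -3.5 -4.7 8.4 48.2 46.9 132.8 141.2 -2.3 -55.3 2.3 0.9 7.3 0.1 0.0 -5.2 -3.8 -4.3 -5.7 201.03
T0387.try3-opt3.pdb.gz -30.0 -14.8 -2.0 -1.7 -37.5 -14.0 -2.5 -3.4 -4.5 8.0 48.7 49.0 138.0 145.8 -2.2 -55.5 3.0 0.6 6.2 0.3 1.0 -4.9 -3.9 -4.5 -5.9 213.39
T0387.MQAC1-opt3.pdb -29.9 -15.0 -1.8 -1.5 -38.1 -14.8 -2.6 -3.4 -4.5 8.2 49.6 50.0 139.0 147.2 -2.1 -55.3 2.7 1.1 4.4 0.3 0.7 -5.0 -3.7 -4.5 -5.4 215.43
T0387.try8-opt1.pdb.gz -19.0 -14.0 -1.8 -1.0 -36.4 -14.7 -2.6 -3.0 -4.5 8.1 50.1 49.6 137.1 146.2 -2.0 -54.4 2.7 1.3 6.6 0.4 7.7 -5.0 -3.7 -4.4 -5.7 237.43
"""


def name_and_total(line: str):
    """Return (model file name, last numeric column) for one score row."""
    fields = line.split()
    return fields[0], float(fields[-1])


# Assumption: the last column is the total cost, and lower is better.
ranked = sorted(
    (name_and_total(line) for line in SCORE_ROWS.splitlines() if line.strip()),
    key=lambda pair: pair[1],
)

for rank, (name, total) in enumerate(ranked, start=1):
    print(f"{rank}  {total:8.2f}  {name}")
```

The first three names in `ranked` are the three full models that were submitted, and try8-opt1 (the docking attempt) comes last.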