Tue Jul 8 10:15:14 PDT 2008 T0489 Make started
Tue Jul 8 10:15:47 PDT 2008 Running on cheep.cse.ucsc.edu

Tue Jul 8 10:57:52 PDT 2008 Kevin Karplus
Based on the name of the protein, I should look for a metal-binding site.
PDB blast finds no close hits.
HMMs get weak hits to 1yqyA, 1j7nA, and 1pwuA (all with the same domains).

Tue Jul 8 17:18:00 PDT 2008 Kevin Karplus
The conserved charged residues are H149, E150, H153, E212, and E219.
These cluster together, along with Y205 and A215.
Other conserved residues in the neighborhood, but not pointing towards the putative active site, are V128, N145, C213, and F222.
Cys residues do not cluster and do not seem to be part of the active site.
I may want to put constraints on H149, E150, H153, E212, E219, and Y205 to retain the shape of the 1yqyA site, but I won't do it for try2, which will mainly be trying to close gaps.

Tue Jul 8 17:27:26 PDT 2008 Kevin Karplus
try2 started on lopez.
I may also want to do another run from alignments, with the try2 costfcn and the extra alignments that I picked up by looking for other examples of the superfamilies in the HMM hits.

Tue Jul 8 17:29:11 PDT 2008 Kevin Karplus
The superfamily d.92.1.* is
    # Metalloproteases ("zincins"), catalytic domain
d.92.1.14 is
    # Family: Anthrax toxin lethal factor, N- and C-terminal domains
The superfamily d.166.1.* is ADP-ribosylation, but does not really line up with the target in the alignments, so I should omit the templates 1f0lA and 1dtpA, which only match d.166.1, not d.92.1.

The most frequently occurring hits that include d.92.1.* are
    55 1yqyA
    23 1j7nA
    18 1c7kA
    17 1bqbA
    15 1satA
    14 1mmqA
    14 1iabA
    12 1i1iP
    10 1rm8A
    10 1kapP
One that scores well but isn't in the template library is 1pwuA.
One from the template library that scores well is 1y93A, though it doesn't score well with most of the target HMMs.
I'll exclude 1hv5A (which was in the try2 set), because it is only found by the t2k.w0.5 HMM, and then not strongly---I'm not likely to get good alignments to it.
The template libraries all seem to like a.123.1.1 domains: Nuclear receptor ligand-binding domain, but there are huge gaps even in the best alignments, so this seems like an unlikely hit.

Tue Jul 8 18:03:46 PDT 2008 Kevin Karplus
try3 started on moai cluster, to try to optimize same costfcn as try2, but starting from alignments to d.92.1.* templates.

Tue Jul 8 23:19:31 PDT 2008 Kevin Karplus
try3 gets a rather different solution for residues M1-F88, with an extra N-terminal strand on the sheet.
This one may justify a little more optimization.

Sat Jul 12 09:34:14 PDT 2008 Kevin Karplus
The MQAU quality assessment likes the SAM_08 server models, the Zhang-Server models, and the SAM-T06-server models.
The MQAC quality assessment likes those, plus pipe_int_TS1.

Sat Jul 12 10:01:53 PDT 2008 Kevin Karplus
I started metaserver predictions for MQAU1 and MQAC1.
I also started try4 to attempt to polish try3.
I should do metaserver predictions for try2.costfcn and try4.costfcn also, though perhaps those should be MQAX runs that use all servers.

Sat Jul 12 10:31:50 PDT 2008 Kevin Karplus
All four MQA runs are working on SAM-T08-server_TS1, which doesn't introduce much variety in my predictions.
I'll do MQAY1, MQAY2, and MQAY4 runs that specifically exclude the SAM servers.

Sat Jul 12 11:00:59 PDT 2008 Kevin Karplus
    MQAY1 favors Pcons_dot_net_TS3
    MQAY2 favors Pcons_dot_net_TS3
    MQAY4 favors BAKER-ROBETTA_TS2

Sat Jul 12 18:32:42 PDT 2008 Kevin Karplus
MQAX4 is still working on SAM-T08-server_TS1, but the other MQA runs finished hours ago.

Sat Jul 12 22:45:06 PDT 2008 Kevin Karplus
All the runs from SAM-T08-server are naturally quite similar, as are the other runs built from alignments.
The only metaserver models not from SAM-T08-server_TS1 are from BAKER-ROBETTA_TS2 (and Pcons_dot_net_TS3, which seems to be copied from robetta), which has a different choice of what to do with the N-terminus.

Sat Jul 12 22:56:53 PDT 2008 Kevin Karplus
try5 will try to polish the BAKER-ROBETTA_TS2 models.
try6 will try to polish the SAM-T08 models.

Sun Jul 13 18:07:10 PDT 2008 Kevin Karplus
With try5 and try6 both being fairly successful polishes, what is there left to do?
Decide which group to put first?
Try to figure out which group is more protein-like?
The SAM-T08 models got much higher scores on the MQAC assessment than the BAKER-ROBETTA ones, and that assessment is based mainly on consensus and does not inherently favor SAM-T08 that much.

Actually, I think the problem is figuring out what the N and C terminal helices are really doing.
Actually, the C-termini of the try5 and try6 models are quite similar, but the N-terminal domains differ.
Perhaps I should submit both try5 and try6 to VAST, and see if anything matches in the N termini.

try5-opt3
    Your VAST Search job was submitted at 07/13/2008 21:21:56(EDT).
    Request ID: 1012165193163227427
try6-opt6
    Your VAST Search job was submitted at 07/13/2008 21:22:48(EDT).
    Request ID: 388569561564216327

Sun Jul 13 18:24:50 PDT 2008 Kevin Karplus
The main VAST hits for the whole chain of try5-opt3 are 1pwuA, 2ejqB=2ejqA, 1tviA, 2epkX=2eplX, 1jakA, 2pmzB, 1oqwA, 2r01A, but only include residues 63-249.
The "domain" 57-142 has best hits 2ejqB=2ejqA, 2pmzB, 1wteA, 1xtgA, 1iw7E=2a6hE, 1scmB.
The "domain" 143-266 has best hits 1pwuA and 1ynsA=1zs9A.
VAST did not look for anything for the N-terminal region, that is, before P63.

Sun Jul 13 19:58:35 PDT 2008 Kevin Karplus
For try6, VAST finds a longer match to 1pwuA, from 32 to 250, and also hits to 2h1nA, 1c7kA, 2epkX=2eplX, 2epoA=2eplX, 2ejqB=2eqjA, 1xtgA.
Domain 1 of try6 (1-60) matches 1a88A and 2o7gB=2o7gA.

Perhaps I need to look at 1pwuA, before I decide whether the SAM-T08-server or BAKER-ROBETTA_TS2 models are better.
Currently, the longer match for SAM-T08-server (try6) is encouraging.
Perhaps I should add 1a88A and 2o7gA as extra hits.

The structure of try6 does indeed match 1pwuA fairly well (T35=>R590, ..., A249=>S776), and the N-terminal helices are a faithful copy, but try6 does not have the parallel strand I583-V587, which would be roughly V28-L32, which has been wound into a helix in try6.

So the question now is: is there a parallel strand there?
Should I try to force one, even though the neural nets don't predict one and the burial pattern looks more helical than strand-like?
Or is the N-terminus completely messed up?
The large insertion for C74-G87 makes the whole N-terminus (everything up to G87) a bit suspect.

Also, I notice that G165-A175 doesn't match the corresponding 11 residues in 1pwuA, though the unmatched parts are identical lengths.
I wonder if copying from 1pwuA would help.
Probably not---the secondary structure prediction doesn't match that well.

I don't think I can make much more progress on this protein.
The N and C termini are probably wrong, but I'm not likely to make them better by fussing with them.
I'll give up and submit
    ReadConformPDB T0489.try6-opt3.pdb # MQAX4-opt3.gromacs0.repack-nonPC < SAM-T08-server_TS1
    ReadConformPDB T0489.MQAX2-opt3.gromacs0.repack-nonPC.pdb # < BAKER-ROBETTA_TS2 # best rosetta energy
    ReadConformPDB T0489.try5-opt3.pdb # < MQAY4-opt3 < BAKER-ROBETTA_TS2
    ReadConformPDB T0489.try4-opt3.pdb # < try3-opt3 < align(1j7nA)
    ReadConformPDB T0489.try2-opt3.gromacs0.pdb # < try1-opt3 < align(1yqyA+2e62A)

Sun Jul 13 21:05:35 PDT 2008 Kevin Karplus
Wait a minute. Maybe I can do something with the N-terminus.
The long straightish section in the BAKER-ROBETTA models (A55-T62) might somehow become the parallel strand I saw in 1pwuA.
Maybe
    SheetConstraint L56 L61 F88 I93 Hbond L92

Sun Jul 13 21:32:35 PDT 2008 Kevin Karplus
For try7, I've added this sheet constraint with a huge weight, and turned breaks and clashes way down.
I'm hoping that undertaker will manage to swing the strand into position, even if it doesn't manage to do so neatly.

Mon Jul 14 08:23:52 PDT 2008 Kevin Karplus
I goofed on that sheet constraint---I've got the Hbond on the wrong side of the strand for the try5 model!
Not only that, but I think that the phase is wrong also.
I'll try again with
    SheetConstraint E60 L64 E90 Y94 Hbond V91

Mon Jul 14 09:58:36 PDT 2008 Kevin Karplus
If try8 works, should I make a chimera of its N-terminal with try6, and optimize that (with suitable strand constraints for the new strand)?

Mon Jul 14 12:45:57 PDT 2008 Kevin Karplus
try8 doesn't quite work, but it's getting closer.
I added another constraint to try to get the N-terminal straight part to pair up also.

Mon Jul 14 16:47:42 PDT 2008 Kevin Karplus
I must have the constraints messed up for try8 and try9, as the two strands are coming out on top of one another in try9.

Tue Jul 15 08:02:19 PDT 2008 Kevin Karplus
I'd like to line up L61 with V91 (parallel) and P6 with T62 (antiparallel).
Hbonds between 60s and 90s should be to I93, so on odd 90s and even 60s.
That leaves Hbonds between 0s and 60s to be on odd 60s.
OK, that's what I specified for try9, except that I had P6 with E60.
I'll try
    SheetConstraint M2 P6 Q66 T62 Hbond P63
for try10, and up the clashes and breaks to try to get something more reasonable.

Tue Jul 15 14:27:15 PDT 2008 Kevin Karplus
try10 is beginning to form the sheets I specified, but at horrible cost in breaks and clashes.
I'll make one more attempt, with P6, P63, and P95 attempted to line up.

Tue Jul 15 14:39:36 PDT 2008 Kevin Karplus
If try11 still looks awful, I'll submit the others and give up on getting this sheet.
Meanwhile, I'll try predicting a fold for M1-P95.

Tue Jul 15 16:37:16 PDT 2008 Kevin Karplus
M1-P95 gets no strong hits, but 2e62A (E-value 2.6) and 2q22A (E-value 17) seem to come up fairly consistently.
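
The residue pairings intended by the sheet constraints above (L61 with V91 parallel, P6 with T62 antiparallel) can be checked with a small helper. This is only a sketch in Python, not part of undertaker. It assumes that "SheetConstraint start1 end1 start2 end2" pairs start1 with start2 and end1 with end2, which is consistent with the pairings described in these notes but is not taken from undertaker documentation. The function names are hypothetical.

```python
import re


def resnum(label):
    """Return the sequence number of a residue label such as 'P6' or 'Q66'."""
    match = re.fullmatch(r"[A-Z](\d+)", label)
    if match is None:
        raise ValueError(f"not a residue label: {label!r}")
    return int(match.group(1))


def sheet_pairs(start1, end1, start2, end2):
    """List the residue-number pairs implied by one SheetConstraint line.

    Assumption (not from undertaker documentation): start1 pairs with
    start2 and end1 pairs with end2, with residues in between paired in order.
    """
    a, b, c, d = (resnum(x) for x in (start1, end1, start2, end2))
    step1 = 1 if b >= a else -1
    step2 = 1 if d >= c else -1
    first = list(range(a, b + step1, step1))
    second = list(range(c, d + step2, step2))
    if len(first) != len(second):
        raise ValueError("the two strands must have the same length")
    return list(zip(first, second))


def is_parallel(start1, end1, start2, end2):
    """True if both strands are listed in the same sequence direction."""
    a, b, c, d = (resnum(x) for x in (start1, end1, start2, end2))
    return (b >= a) == (d >= c)


if __name__ == "__main__":
    # try10 constraint: M2..P6 against Q66..T62
    print(sheet_pairs("M2", "P6", "Q66", "T62"))
    # second try at the 60s/90s pairing: E60..L64 against E90..Y94
    print(sheet_pairs("E60", "L64", "E90", "Y94"))
```

Under that assumption, the try10 constraint puts P6 opposite T62 and the E60/E90 constraint puts L61 opposite V91, matching the stated intent.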

Tue Jul 15 22:27:40 PDT 2008 Kevin Karplus
M1-P95/try1 suggests a rather different N-terminal domain than I've been exploring so far, with
    SheetConstraint L80 L85 I93 F88 Hbond F88
This is not compatible with the try5 model that I've been trying to add an N-terminal domain to, but it is compatible with the try6 model.

For M1-P95/try2 and M1-P95/try3, I'll try making F88-I93 be an edge strand of the N-terminal domain, with
    SheetConstraint L80 L85 I93 F88 Hbond F88
    SheetConstraint Q57 L61 L80 W84 Hbond W84

Wed Jul 16 00:02:18 PDT 2008 Kevin Karplus
Actually, M1-P95/try3-opt3 almost comes up with what may be a better solution:
    SheetConstraint L56 L61 E79 W84 hbond L59
    SheetConstraint L56 E60 Y94 E90 hbond G58
I'll try running M1-P95/try4 and M1-P95/try5 to optimize for that sheet from both existing models (try4) and alignments (try5).
If either builds a good model, I'll try slapping it onto try6 and reoptimizing to close breaks.

Wed Jul 16 09:12:31 PDT 2008 Kevin Karplus
M1-P95/try4 scores better than M1-P95/try5, so there probably isn't a better alignment waiting out there to be captured.
The hand sheet constraints are still not scoring all that well.

Wed Jul 16 10:57:16 PDT 2008 Kevin Karplus
I upped the hand.sheet constraints for M1-P95/try6 and am doing one more polishing run.
Then I'll have to see if I can paste it into ./try6.
Looking at the M1-P95/try4 and ./try6, I'm not sure this is going to work---the N-terminal helices bump into the C-terminal domain pretty badly, and it would take a lot of work to swing them out of the way.

Wed Jul 16 13:12:57 PDT 2008 Kevin Karplus
I made a chimera of try6 and M1-P95/try6, but it really does have horrendous clashes.
I don't know if I'll be able to make anything sensible out of it, but I'll see what can be done in try12.

Wed Jul 16 15:21:22 PDT 2008 Kevin Karplus
try12 managed to avoid some of the clashes, but only by moving the sheet out of the domain and tearing off later strands.
Not a pretty sight!

Wed Jul 16 17:46:41 PDT 2008 Kevin Karplus
I think I'll redo try12 as try13, but turn the sheet constraints for the existing sheets way up.

Wed Jul 16 20:09:59 PDT 2008 Kevin Karplus
try13 gets good sheet costs, but bad breaks and clashes.
The helices from N145 on are essentially the same as in try6, but the sheet has been rotated a bit.

Wed Jul 16 20:39:16 PDT 2008 Kevin Karplus
To keep try13 mostly intact, but move the sheet back into position, I broke try13-opt3 into 2 parts, try13-N and try13-C, around A141, then superimposed both on try6-opt3 (putting extra weight on the sheet residues).
I then made a chimera of these parts, chimera-try13-try6:
    M1-W132 from T0489.try13-opt3.pdb (try13-N)
    L133-K154 from T0489.try6-opt3.pdb
    L155-end from T0489.try13-opt3.pdb (try13-C)
I'll try polishing this chimera, reducing clashes and breaks while keeping the sheet.

Thu Jul 17 11:43:51 PDT 2008 Kevin Karplus
try14 is really messed up.
I think I'm going to give up on this attempt to clean up the N-terminus.

Thu Jul 17 16:26:40 PDT 2008 SAM-T08-MQAO hand QA T0489 Submitted
Thu Jul 17 16:32:28 PDT 2008 SAM-T08-MQAO hand QA T0489 Submitted
Thu Jul 17 16:32:28 PDT 2008 SAM-T08-MQAU hand QA T0489 Submitted
Thu Jul 17 16:32:28 PDT 2008 SAM-T08-MQAC hand QA T0489 Submitted

Fri Jul 18 14:45:52 PDT 2008 Kevin Karplus
Maybe I should just do an MQAY14 run, to see if there are any other server models I should polish.

Fri Jul 18 19:34:11 PDT 2008 Kevin Karplus
MQAY14 is optimizing ACOMPMOD_TS3, which is certainly a server I've never seen come up on the lists before.
I'm sure it will be "different".

Sat Jul 19 14:05:10 PDT 2008 Kevin Karplus
MQAY14 is terrible, not even close to being compact, though it forms a couple of sheets.
I'm ready to give up on this target.

Sat Jul 26 14:24:43 PDT 2008 Kevin Karplus
Maybe I should look at cleaning up some of the internal loops, instead of the termini.
For example, I don't much care for T157-G176 of try6-opt3, and C74-F88 is also suspect.
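
The residue bookkeeping for chimera-try13-try6 (Wed Jul 16 20:39:16 entry) can be written as a lookup table. This is a hypothetical Python sketch of the segment boundaries only. It does not do the superposition on try6-opt3 or any PDB editing, and the names SEGMENTS and source_for are invented for illustration. The chain length is not given in that entry, so the last segment is left open-ended.

```python
# Segment table for chimera-try13-try6, copied from the notes.
# None as the last residue means "to the end of the chain".
SEGMENTS = [
    (1, 132, "T0489.try13-opt3.pdb"),     # M1-W132 (try13-N)
    (133, 154, "T0489.try6-opt3.pdb"),    # L133-K154
    (155, None, "T0489.try13-opt3.pdb"),  # L155-end (try13-C)
]


def source_for(residue_number):
    """Return the PDB file that supplies a given residue of the chimera."""
    for first, last, pdb_name in SEGMENTS:
        if residue_number >= first and (last is None or residue_number <= last):
            return pdb_name
    raise ValueError(f"residue {residue_number} is not covered by any segment")


if __name__ == "__main__":
    # The break point A141 falls inside the piece taken from try6-opt3.
    for n in (1, 132, 133, 141, 154, 155):
        print(n, source_for(n))
```

Keeping the segments in one table makes it easy to confirm that the pieces are contiguous and that the break around A141 is bridged by the try6 segment.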

Sat Jul 26 16:57:49 PDT 2008 Kevin Karplus
For R156-W177 (or maybe K154-E178), I think I like try4-opt3 best.
For M71-E90, I could paste try2-opt3.gromacs0 into try6-opt3.
MQAX2 and try5 are nicely continuous in this region but have a very different N-terminal solution.
Actually, I think I'd prefer just M71-L85 from try2.
I'll try making a chimera-try6-try4-try2 model from these pieces.

Sat Jul 26 17:23:32 PDT 2008 Kevin Karplus
I'm trying to optimize this chimera as try15.

Sat Jul 26 19:48:28 PDT 2008 Kevin Karplus
try15 gets better burial but worse breaks than try6.
I think I like try15 a bit better, despite its slightly higher cost, so I'll try polishing it.

Sat Jul 26 20:05:29 PDT 2008 Kevin Karplus
try16 is attempting to polish try15, putting high weight on restoring the try2 and try4 components that were copied in.

Sun Jul 27 11:12:05 PDT 2008 Kevin Karplus
try16-opt3 is the new best model, but I just realized that I never put constraints on H149, E150, H153, E212 (and maybe E219, Y205, N145) as I had intended.
1yqyA has a zinc bound there and
    (2R)-2-{[(4-FLUORO-3-METHYLPHENYL)SULFONYL]AMINO}-N-HYDROXY-2-TETRAHYDRO-2H-PYRAN-4-YLACETAMIDE
I'll get the spacing constraints for the zinc site:
    H149.NE2 H153.NE2 3.03
    H149.ND1 H153.ND1 6.30
    H153.NE2 E150.OE1 5.33
    H153.NE2 E150.OE2 4.09
    H149.NE2 E150.OE1 4.21
    H149.NE2 E150.OE2 4.33
    H153.NE2 E212.OE1 3.54
    H153.NE2 E212.OE2 3.84
    H149.NE2 E212.OE1 3.53
    H149.NE2 E212.OE2 5.17
    E150.OE1 E212.OE1 6.85
    E150.OE1 E212.OE2 7.75
    E150.OE2 E212.OE1 6.52
    E150.OE2 E212.OE2 6.93
    E219.OE2 H149.ND1 2.89
    Y205.OH H149.NE2 5.40
    Y205.OH H153.NE2 6.40
    Y205.OH E212.OE1 3.43
    Y205.OH E212.OE2 4.39

Sun Jul 27 11:52:02 PDT 2008 Kevin Karplus
try17 will try to fix the zinc site and also try to fix up a couple of the helices that I think need work.
After try17, I should probably run again without the helix fixes.
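
One way to see how well a model keeps the zinc-site spacings listed above is to compare each target distance with the distance between the same atoms in the model. The following is a minimal Python sketch, not the undertaker cost function. It assumes the numbers are distances in Angstroms and that atom coordinates are available in a dictionary keyed by labels like "H149.NE2"; that dictionary layout and the function names are hypothetical.

```python
import math

# Spacing constraints copied from the notes (assumed to be in Angstroms).
ZINC_SITE = """\
H149.NE2 H153.NE2 3.03
H149.ND1 H153.ND1 6.30
H153.NE2 E150.OE1 5.33
H153.NE2 E150.OE2 4.09
H149.NE2 E150.OE1 4.21
H149.NE2 E150.OE2 4.33
H153.NE2 E212.OE1 3.54
H153.NE2 E212.OE2 3.84
H149.NE2 E212.OE1 3.53
H149.NE2 E212.OE2 5.17
E150.OE1 E212.OE1 6.85
E150.OE1 E212.OE2 7.75
E150.OE2 E212.OE1 6.52
E150.OE2 E212.OE2 6.93
E219.OE2 H149.ND1 2.89
Y205.OH H149.NE2 5.40
Y205.OH H153.NE2 6.40
Y205.OH E212.OE1 3.43
Y205.OH E212.OE2 4.39
"""


def parse_constraints(text):
    """Parse 'atom1 atom2 distance' lines into (atom1, atom2, float) tuples."""
    constraints = []
    for line in text.splitlines():
        if not line.strip():
            continue
        atom1, atom2, distance = line.split()
        constraints.append((atom1, atom2, float(distance)))
    return constraints


def deviations(constraints, coords):
    """Return (atom1, atom2, model distance minus target) for each constraint.

    coords is a hypothetical mapping from atom label to an (x, y, z) tuple;
    constraints whose atoms are missing from coords are skipped.
    """
    result = []
    for atom1, atom2, target in constraints:
        if atom1 in coords and atom2 in coords:
            actual = math.dist(coords[atom1], coords[atom2])
            result.append((atom1, atom2, actual - target))
    return result
```

A positive deviation means the pair is farther apart in the model than in 1yqyA, and a negative one means it is closer.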

Sun Jul 27 16:02:10 PDT 2008 Kevin Karplus
try17 is the new best, and rosetta likes try17-opt3.gromacs0.repack-nonPC.
I'll do another polishing run, without the attempt to extend the helices this time, and with the zinc site turned up higher.

Sun Jul 27 18:19:19 PDT 2008 Kevin Karplus
The first two OptConform runs in try18 (with clashes and breaks turned down by a factor of 4 on the first and of 2 on the second) may have been wasted as they polished try17-opt3, but the third run (with the clashes and breaks all the way up again) started over with try17-opt3.gromacs0.
There may be some crossover between the models, but I could probably have saved an hour by not scaling down soft_clashes and breaks.

I won't want to submit a Rosetta-repacked model as my first model, as Rosetta throws away the carefully constructed zinc site in try17-opt3 or try17-opt3.gromacs0.
(There are probably some clashes in the site in try17-opt3, which gromacs relieves without totally destroying the site.)

Sun Jul 27 19:17:34 PDT 2008 Kevin Karplus
try18-opt3 is new best, and try18-opt3.gromacs0.repack-nonPC is best rosetta energy.
I think I've polished this lineage as much as feasible.
The question now is whether any of the other N-terminal ends are salvageable.
Maybe, instead of M1-P95, I should do an M1-N131 subdomain, to include the whole sheet.

Sun Jul 27 21:26:21 PDT 2008 Kevin Karplus
M1-N131 doesn't get any strong hits (strongest is 2e62A E-value>4.5).
This is the same top hit as for M1-P95.
The other shared hits in the top 20 are 2q22A (11th for M1-P95, 2nd for M1-N131), 1otjA (5th and 19th), and 2db2A (13th and 15th).
I may want to do a run with just 2e62A, 2q22A, 1otjA, and 2db2A, if the M1-N131/try1 run doesn't look promising.

Sun Jul 27 23:20:13 PDT 2008 Kevin Karplus
The M1-N131/try1 run seems to be based mainly on 2q22A, but it manages to mess up part of the alignment that was fairly decent.
For M1-N131/try2, I'll try re-optimizing the try1-init model, which seems to come closest to having the intended sheet.
For M1-N131/try3, I'll use the try2 costfcn, but start from alignments to just 2q22A and 2e62A.

Mon Jul 28 08:13:35 PDT 2008 Kevin Karplus
The M1-N131/try3 sheets are incompatible with the ones from the main try18 prediction.

Mon Jul 28 11:50:13 PDT 2008 Kevin Karplus
The M1-N131/try4 run (from 2q22A) is also rather incompatible with the whole-chain predictions.

Mon Jul 28 12:05:44 PDT 2008 Kevin Karplus
OK, I give up on the N-terminus.
I don't believe what I have in the 5 models I'm submitting, but I can't seem to get anything better.
    ReadConformPDB T0489.try18-opt3.pdb # < try17-opt3.gromacs0 < try16-opt3 < try15-opt3 < chimera-try6-try4-try2
    ReadConformPDB T0489.try6-opt3.pdb # < MQAX4-opt3.gromacs0.repack-nonPC < SAM-T08-server_TS1
    ReadConformPDB T0489.try5-opt3.gromacs0.repack-nonPC.pdb # < MQAY4-opt3 < BAKER-ROBETTA_TS2
    ReadConformPDB T0489.try2-opt3.gromacs0.pdb # < try1-opt3 < align(1yqyA+2e62A)
    ReadConformPDB T0489.try4-opt3.pdb # < try3-opt3 < align(1j7nA)

Mon Jul 28 12:15:50 PDT 2008 Kevin Karplus
Submitted with comment:

The SAM HMMs had somewhat weak but consistent hits to the d.92.1.* superfamily, the "zincins".
When I did metaserver runs, I consistently got the SAM-T08-server_TS1 model as the primary source for the predictions.
Only by excluding the SAM servers did I find another hit: BAKER-ROBETTA_TS2.

The central core seems pretty good, though there are some insertions and changes of helix length that may have thrown off the alignment.
I'm not confident of the N and C termini, but I think that the SAM-server-T08-based model is more convincing (closer to 1pwuA) than the BAKER-ROBETTA-based model.

After optimizing for a while, I realized that I had neglected to put in constraints to hold the conserved residues of the zinc-binding site in place, so I copied some distance constraints from 1yqyA for H149, E150, H153, E212, E219, and Y205.

I'm still really dubious about the N-terminus, but was unable to construct a model with anything more convincing, given the very tight time constraints of CASP.

Model
1 T0489.try18-opt3.pdb # < try17-opt3.gromacs0 < try16-opt3 < try15-opt3 < chimera-try6-try4-try2
    # best scoring, has the zinc constraints.
    # Rosetta likes this backbone, but repacks the residues around
    # the zinc site.
    chimera-try6-try4-try2:
        mostly from T0489.try6-opt3.pdb
        K154-E178 from T0489.try4-opt3.pdb
        M71-L85 from T0489.try2-opt3.gromacs0.pdb
2 T0489.try6-opt3.pdb # < MQAX4-opt3.gromacs0.repack-nonPC < SAM-T08-server_TS1
3 T0489.try5-opt3.gromacs0.repack-nonPC.pdb # < MQAY4-opt3 < BAKER-ROBETTA_TS2
4 T0489.try2-opt3.gromacs0.pdb # < try1-opt3 < align(1yqyA+2e62A)
5 T0489.try4-opt3.pdb # < try3-opt3 < align(1j7nA)