Tue Jun 6 09:11:55 PDT 2006 T0319 Make started
Tue Jun 6 09:12:25 PDT 2006 Running on orcas.cse.ucsc.edu

Tue Jun 6 09:16:34 PDT 2006 Kevin Karplus
    No good hits with BLAST; the best is 1uedA with E-value 3.1.
    This may require new-fold methods.
    Luckily it is only 135 residues long.

Tue Jun 6 14:24:35 PDT 2006 Kevin Karplus
    The HMMs don't really find anything either.
    The best hits are to g.41.*.
    These are rubredoxin-like metal-bound folds (zinc beta-ribbon,
    rubredoxin, and Microbial and mitochondrial ADK, insert "zinc finger"
    domain).

    I only see 2 conserved CYS in t06: C11 and C112, though C16 and C115
    may also be somewhat conserved. Are these all spaced around one ion?
    I should look at the rubredoxin folds such as 1rlyA, 1iro, 1zin, ...

Sun Jun 25 08:59:27 PDT 2006 Kevin Karplus
    Soft deadline is Tuesday, and no one has looked at this target yet!

    The server scores highlight the usual suspects:
        SAM_T06_server_TS1, Pmodeller6_TS3, ROBETTA_TS4, Pmodeller6_TS2,
        Pmodeller6_TS4, Pcons6_TS1, ROBETTA_TS2, ...

    None of the server models look very good.
    There is a bit of anti-parallel sheet in Pmodeller6_TS2 from S104 to
    K122 that may be worth stealing.

    There is some very nice sheet in Pcons6_TS1 and ROBETTA_TS2.
    We may want to steal those sheet constraints and use VAST to figure
    out what template they used, since all of our alignments into
    undertaker seem to have been short, scrappy things.

Sun Jun 25 09:14:19 PDT 2006 Kevin Karplus
    I sent the Pcons6_TS1 model to VAST as
    Request ID: 546346314296527426

Sun Jun 25 09:33:38 PDT 2006 Kevin Karplus
    Somewhat surprisingly, VAST found no real hits for the Pcons6_TS1
    model; it appears that the sheet was formed without the aid of a
    template! I wish our ab-initio search strategies were capable of that.
    At first I was a bit dubious about two servers coming up with the
    same sheet, as Pcons and Robetta found the same model, but then I
    realized that the models were *identical* and Pcons is really a
    meta-meta-server:

        METHOD Pcons6
        METHOD -------------
        MODEL 1
        PARENT N/A
        REMARK UNIQ NAME /afs/pdc.kth.se/home/b/bjornw/casp7/targets/T0319/pcons5/pcons5.T0319. 19840//modeller/T0319.none.robetta.2.pdb

Sun Jun 25 23:37:13 PDT 2006 George Shackelford
    I did a search for ehl2 (from t06.str2) matches and found these.

        id       score     per residue    5S 10N 10N
        1f2tB    108.432   0.8032
        1ewxA    107.567   0.796793
        1srvA    107.428   0.795763
        1y2qA    107.349   0.795178
        1o8xA    107.289   0.794734
        1qk8A    107.272   0.794608
        1i5gA    107.041   0.792897
        1jkeA    106.916   0.791971
        1wpuA    106.887   0.791756
        1pc6A    106.809   0.791179
        1j7gA    106.559   0.789326
        1sqlA    106.541   0.789193
        2a2lA    106.474   0.788697
        1f4pA    106.452   0.788534
        1nmeA    106.384   0.78803
        1vmfA    106.378   0.787986
        1em8A    106.375   0.787964
        2fy6A    106.358   0.787838
        1k2xB    106.334   0.78766
        1th8A    106.234   0.786919

    I need to redo this with log odds scoring (sum of logs), but for now
    I may as well take what I can get. I'll add these to the manual top
    hits and get alignments. I think I'll break them into two groups just
    to get started.

    I ran try2 with the top ten, try3 with the next ten and rr.constraints (max on that).
    I ran try3   ""              try5   ""               rr.0.2.constraints

Mon Jun 26 02:35:29 PDT 2006 George Shackelford
    What I noticed most was that despite a strong prediction for a strand
    at the start, the resulting matches tended to have a helix at the
    beginning. That disturbed me enough to put log scoring in.
    Now I get:

        peep research/alphabetsoup> alphabetmatch -t T0319 -db pdbaanr.stride_str2.db --limit 20 --gapextend 1.5
        # program: alphabetmatch
        # George Shackelford
        #
        # Target: T0319
        # length: 135
        # range: 125 to 148
        # id     score     per residue    5S 10N 10N
        1r5tA    200.331   1.48394
        1y2qA    198.214   1.46825
        1f2tB    195.919   1.45125
        1jtkA    192.026   1.42242
        1uwzA    190.98    1.41467
        2f2eA    190.848   1.41369
        1kllA    190.658   1.41228
        1o8xA    190.075   1.40796
        1oniA    189.325   1.4024
        1ir21    188.609   1.3971
        1i5gA    186.015   1.37789
        1pf5A    185.229   1.37207
        1qahA    185.001   1.37038
        1ewxA    184.994   1.37033
        1wn5A    184.962   1.37009
        1qk8A    184.78    1.36874
        1vmfA    183.197   1.35701
        1fjgI    183.156   1.35671
        2anxA    182.505   1.35189
        1fjgK    182.457   1.35153

    The top is reshuffled a bit and some have dropped off, e.g. 1srvA.
    1y2qA looks like a good match. I think I'll be redoing stuff tomorrow
    with these new values (and rr.0.2.constraints).

Mon Jun 26 17:03:39 PDT 2006 George Shackelford
    So I did a try6 and try7 with the top ten and next top ten from the
    above list. Try6 did well, but it is foamy and still has that awful
    helix at the start. I've cranked up the weight on that strand and
    extended it as well. I've also taken the main sheet from try1 and
    I have adjusted it to tighten a hairpin:

        #try1 sheets
        SheetConstraint (T0319)A106 (T0319)R113 (T0319)N123 (T0319)G116 hbond (T0319)G108 5
        # This next one looks rather small. Let's take it out for now
        # SheetConstraint (T0319)M110 (T0319)C112 (T0319)I82 (T0319)E84 hbond (T0319)K111 1
        # I shifted this one slightly to tighten up the hairpin.
        SheetConstraint (T0319)H117 (T0319)I121 (T0319)L130 (T0319)P126 hbond (T0319)I118 5

    Try8 is using try1.under as its try8.under, so we can do a restart
    (and try1 scores well).
    Try9 is using try6.under as its try9.under. Try6 scores well but
    I don't believe it.

    Try8 running on peep.

    The soft deadline is tomorrow.
    This is currently the best list I have (using unconstrained):

        Try1-opt2            decent sheet, appealing
        Try3-opt2            best scoring
        try4-opt2            well scoring (there's a pattern here)
        try6-opt2            well scoring - a bit like try4 in appearance
        SAM_T06_server_TS1   fifth for now - try8 is coming on strong!

    Frankly I don't really buy any of these, but they at least look good.
    I may adjust this later...

    The new version:

    The soft deadline is tomorrow.
    This is currently the best list I have (using unconstrained):

        Try1-opt2            decent sheet, appealing
        Try3-opt2            best scoring
        try4-opt2            well scoring (there's a pattern here)
        try6-opt2            well scoring - a bit like try4 in appearance
        try8-opt2            interesting. Bad breaks, but interesting.

Mon Jun 26 21:21:20 PDT 2006 Kevin Karplus
    I'll try going with George's latest list, but I'll have to look at
    the models closely before the hard deadline.

    George still hasn't fixed his .cshrc file so that the sort-by-rosetta
    script will work. This is just a simple matter of adding
        setenv SHELL /bin/tcsh

    After making decoys/grep_best_rosetta, I see that
    Rosetta most likes repacking, try7, then try6, try1, try8.
    I'll have to look to see why it likes try7.

    The unconstrained costfcn prefers
        try3, try1, try8, try5, try4, try6, try2, try9,
    but doesn't like try7's lack of beta Hbonds.

Mon Jun 26 21:32:36 PDT 2006 George Shackelford
    As I mentioned in response the first time Kevin mentioned the lack of
    "setenv SHELL /bin/tcsh," I had inserted that only to find that my
    KDE desktop failed to start up. I'll find out why one of these days.
    In the meantime I've got to make sure I can boot up at school.

Mon Jun 26 21:44:35 PDT 2006 Kevin Karplus
    I'll submit the models George has listed, but I'm not really sure why
    he is generating so many new templates to look at, when we haven't
    explored all the templates that at least scored marginally well with
    the HMMs.
    I don't think that we have any evidence that searching with
    secondary-structure only is even as good as using the HMMs, much less
    better.

    I think that we may want to try to cluster the CYS and HIS residues
    for this target. Residues C11, C16, C112, and C115 are particularly
    likely to come together for metal binding.

    I wonder about the role of the conserved W56 also.

Sat Jul 8 16:28:36 PDT 2006 George Shackelford
    Whenever I see a conserved tryptophan, I always think of its size.
    I've seen conserved W's before; I should see if there is some function
    that they get involved in.

    I've made hard copies of some of the secondary predictions to see if
    there are any other sheets that come to mind besides what I've found
    above. This is more challenging since we have a dimer. Does this
    imply a homo-dimer?

Sun Jul 9 16:02:38 PDT 2006 George Shackelford
    No, no, this isn't a dimer. That's T0343.

    From More Info:
        Subunit of an adoMet-dependent tRNA methyltransferase (MTase)
        complex (Trm11p-Trm112p), required for the methylation of the
        guanosine nucleotide at position 10 (m2G10) in tRNAs.

    So we have a piece of a large complex. Great. I can check the list of
    templates for such structures. Also I could look for other MTase
    complexes or parts in the PDB.

    First I need to review all existing tries and their source(s). During
    that, I can get a list of best sheets, helices, and strands.

    I shouldn't forget to check

Sun Jul 9 21:13:20 PDT 2006 George Shackelford
    Starting to analyze the heritage of each try.

        try1 initial run, T0319.try1-al7+all-align.a2m:1wdjA
        try2

Mon Jul 10 16:03:24 PDT 2006 George Shackelford
    Never mind. Superimposing shows that the top five models are quite
    different from each other. Little reason to worry about heritage.

    The big problem is that none look as good as we'd like. I need to
    clean up or replace the ones we have. I need to get onto doing T0343
    dimers ASAP.
    try3 pretty but unreal - forget it for now
    try1 has a chance to join the last strand to the starting strand.
        SheetConstraint (T0319)A106 (T0319)R113 (T0319)N123 (T0319)G116 hbond (T0319)G108 1
        SheetConstraint (T0319)M110 (T0319)C112 (T0319)I82 (T0319)E84 hbond (T0319)K111 1
        SheetConstraint (T0319)H117 (T0319)I121 (T0319)P131 (T0319)N127 hbond (T0319)I118 1
    try8
    try5(?)
    try4
    try6

        ReadConformPDB T0319.try1-opt2.pdb
        ReadConformPDB T0319.try3-opt2.pdb
        ReadConformPDB T0319.try4-opt2.pdb
        ReadConformPDB T0319.try6-opt2.pdb
        ReadConformPDB T0319.try8-opt2.pdb

    I could use a "build_sheet" program that can take strands and build a
    sheet based on near predictions and an orientation of "up", "down",
    or "neither."

        build_sheet -target T0319 -orient up -dist t06 < old.sheet > fixed.sheet

        SheetConstraint A106 R113 N123 G116 hbond A106
        SheetConstraint H117 I121 P131 N127 hbond I118
        SheetConstraint N127 P131 K10 T6 hbond N127

    Let's give it a try.
    try10 is using the try1 under and costfcn files with the constraint
    above added. I also focused on using the t06.str2.constraints and
    I cranked down the rr.constraints.

    try10 running on vashon

Tue Jul 11 21:49:41 PDT 2006 George Shackelford
    try10 fought to keep the starting strand from forming. Great.
    I find that it locked in on 1w2lA. I have commented out
        include T0319.t04.undertaker-align.under
    because it seems t04 is the source for this template. At least I think
    so. I've cranked up the weight to 17 on the starting strand. Let's see
    it beat us now!

    try11 running on vashon

    I'm just going to have to try and see if I can get something out of
    two templates based on ribosomal subunits. Sure, just another wild
    shot but if I keep meeting resistance from trying to make sheets that
    might work...

    try12 running on vashon

Thu Jul 13 11:21:36 PDT 2006 George Shackelford
    try11 is doing its best to meet the constraints but it is blowing up.
    Very stringy.

    try12 has bad breaks but it forms a new fold from the others.
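    The hand-edited SheetConstraint lines above are easy to get wrong, so
    a small checker may help. What follows is only a sketch: it assumes
    the field order is "start1 end1 start2 end2 hbond residue [weight]",
    which is read off the lines in this log rather than taken from the
    undertaker documentation, and the helper names (parse_sheet_constraint,
    is_antiparallel, residues_between_ends) are made up for illustration.

```python
import re
from typing import NamedTuple, Optional


class SheetConstraint(NamedTuple):
    start1: int
    end1: int
    start2: int
    end2: int
    hbond: int
    weight: Optional[float]  # None when the line carries no weight


# Accepts "(T0319)A106" as well as the short form "A106".
_RESIDUE = re.compile(r"(?:\([^)]*\))?[A-Z](\d+)$")


def residue_number(token: str) -> int:
    """Return the sequence position from a residue token such as (T0319)A106."""
    match = _RESIDUE.match(token)
    if match is None:
        raise ValueError(f"not a residue token: {token!r}")
    return int(match.group(1))


def parse_sheet_constraint(line: str) -> SheetConstraint:
    """Parse one uncommented SheetConstraint line (assumed field order)."""
    fields = line.split()
    if (len(fields) not in (7, 8) or fields[0] != "SheetConstraint"
            or fields[5] != "hbond"):
        raise ValueError(f"not a SheetConstraint line: {line!r}")
    s1, e1, s2, e2 = (residue_number(t) for t in fields[1:5])
    weight = float(fields[7]) if len(fields) == 8 else None
    return SheetConstraint(s1, e1, s2, e2, residue_number(fields[6]), weight)


def is_antiparallel(c: SheetConstraint) -> bool:
    """True when the two strands run in opposite sequence directions."""
    return (c.end1 - c.start1) * (c.end2 - c.start2) < 0


def residues_between_ends(c: SheetConstraint) -> int:
    """Residues strictly between the paired strand ends.

    For a hairpin (two sequence-adjacent antiparallel strands) this is
    the length of the turn.
    """
    return abs(c.end2 - c.end1) - 1
```

    Under these assumptions the A106-R113 / N123-G116 pairing leaves a
    2-residue turn, and pairing H117-I121 with L130-P126 instead of
    P131-N127 shortens that turn from 5 residues to 4, which matches the
    "tighten up the hairpin" comment.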
    I've made a modified unconstrained to see which ones have matched the
    basic constraints.
        using constrained.costfcn

    Time to check our superimposing.

        ReadConformPDB T0319.try1-opt2.pdb
        ReadConformPDB T0319.try3-opt2.pdb
        ReadConformPDB T0319.try4-opt2.pdb
        ReadConformPDB T0319.try6-opt2.pdb
        ReadConformPDB T0319.try8-opt2.pdb

    ---------
    Damn. I just lost a lot of comments here when I reloaded.
    Briefly, I did a "constrained.costfcn" and did a score-all.
    What we have above is what we get...
    ---------
    except for moving try8 up, we have the same best 5.

    I tried changing constrained.costfcn by removing the
    rr.0.1.constraints and including the original dssp-ehl2.constraints.
    All that happened was some reordering of the top 5. I even replaced
    rr.0.1.constraints with rr.constraints and again we got the same
    top 5. The weight for constraints is 10; we could get more impact
    from increasing it, but I'm not going there.

Thu Jul 13 13:35:07 PDT 2006 George Shackelford
    try11 was an effort to straighten out what appears to be a strand at
    the start, F3-L10. However, there is a helical signal in the middle;
    this could just be a helix, and it keeps getting folded as a helix.

    I've now run score-all+servers.constrained.pretty. Our five score
    much better than anyone else's, and not just because of the
    constraints. ROBETTA_TS2 and _TS4 are the two best (besides
    meta-servers) and they both score well w.r.t. our constraints.
    I looked at them and while I found them different, they were very
    foamy. I just don't see any hints from there.

    Overall, this target is a difficult one. It is part of a larger
    complex and the interface(s) are going to be hydrophobic. No real
    matches anywhere.

Thu Jul 13 17:51:19 PDT 2006 Kevin Karplus
    OK, I see a list of 5 models to submit, but no clear notes on how we
    created them. Was any ab initio work done, or just desperate fold
    recognition?

    George never created and looked at the grep-best-rosetta summary,
    which often shows a quite different view of which models are best.
    (Rosetta cares much more about clashes and breaks than undertaker
    does.)

    I am remaking the T0319.do1 through T0319.do9 targets, so that we can
    evaluate the gromacs0.repack-nonPC models.

    While that is running, I'll try to extract the history of the
    individual targets from the log files.

        try1    from alignments, probably 1wdjA
        try2    from alignments, probably 1wpuA
        try3    from alignments, probably 1sqlA
        try4    from alignments, probably 1f2tB
        try5    from alignments, probably 1sqlA
        try6    from alignments, probably 1f2tB
        try7    from alignments, probably 1wn5A
        try8    from alignments, probably 1wdjA
        try9    from alignments, probably 1ir21
        try10   from alignments, probably 1w2lA
        try11   from alignments, probably 2cc0A
        try12   from alignments, probably 1fjgI

    George favors templates
        try1    1wdjA
        try3    1sqlA
        try4    1f2tB
        try6    1f2tB
        try8    1wdjA

    Rosetta likes best
        try7, try1, try4, try8, try6, try2 (all gromacs0.repack-nonPC)
    try1 costfcn likes
        try1, try8, try4, try6, try9, try10

    I'll add try[27]-opt2.gromacs0.repack-nonPC and try9-opt2 to the
    superimpose-best.under, and see which 5 models to submit.
    (try1 and try8 may be too similar, as may try4 and try6)

Thu Jul 13 18:13:28 PDT 2006 Kevin Karplus
    The models George chose are quite different; the templates are
    probably only providing small pieces of supersecondary structure, so
    the insertion of several alignments results in quite different
    structures.

    try2 may be plausible, as may try9.
    try7 I reject, because it is almost all helical where strands are
    predicted (and has a strand where a helix is predicted!).

Thu Jul 13 18:18:06 PDT 2006 Kevin Karplus
    I made a "secondary.costfcn" like unconstrained, but with the
    constraints from dssp-ehl2 included (not sheet constraints or rr
    constraints).

    secondary prefers
        try1, try4, try6, try8, try3, try5
    which has the same 5 on top that George preferred, though in a
    different order. (George had try3 moved up to the second position,
    otherwise they are the same.)
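    The three orderings quoted above (Rosetta, try1 costfcn, secondary
    costfcn) could also be merged mechanically. The sketch below uses a
    plain Borda count, which is an assumption made here for illustration
    and not how the submitted order was actually chosen; the function
    name borda and the tie-break by model name are likewise hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

# Orderings copied from the notes above, best model first.
RANKINGS: Dict[str, List[str]] = {
    "rosetta": ["try7", "try1", "try4", "try8", "try6", "try2"],
    "try1_costfcn": ["try1", "try8", "try4", "try6", "try9", "try10"],
    "secondary_costfcn": ["try1", "try4", "try6", "try8", "try3", "try5"],
}


def borda(rankings: Dict[str, List[str]]) -> List[str]:
    """Merge ranked lists with a Borda count.

    In a list of length n the top model earns n points and the last
    earns 1; models missing from a list earn nothing from it. Ties are
    broken alphabetically by model name, an arbitrary choice.
    """
    points: Dict[str, int] = defaultdict(int)
    for order in rankings.values():
        n = len(order)
        for position, model in enumerate(order):
            points[model] += n - position
    return sorted(points, key=lambda model: (-points[model], model))


if __name__ == "__main__":
    for model in borda(RANKINGS):
        print(model)
```

    A count like this cannot see what visual inspection sees: it still
    rewards try7 for its Rosetta score even though its secondary
    structure contradicts the predictions.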
    After looking at them, I think I'll move try3 to the end and submit.

Thu Jul 13 18:29:41 PDT 2006 Kevin Karplus
    I resubmitted, even though this was just a reordering of the models
    we had done for the preliminary submission.

Mon Oct 9 12:00:30 PDT 2006 Kevin Karplus
    Unless I copied the PDB file name wrong, everyone did badly on this
    target. The secondary structure and burial predictions are fairly
    consistent with the correct model, so I don't think I have a typo.

    Our best model is try10-opt1-scwrl (not submitted), but its GDT is
    only 15%.
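    For scale, here is a rough sketch of how a GDT-style number is put
    together, assuming the GDT quoted above is the usual GDT_TS: the mean,
    over cutoffs of 1, 2, 4, and 8 Angstroms, of the fraction of target
    residues whose C-alpha atom lies within the cutoff of its position in
    the experimental structure. The real measure searches over many
    superpositions for each cutoff; this simplified version takes the
    distances from one fixed superposition as given, so it can only
    underestimate. The function name gdt_ts_fixed is hypothetical.

```python
from typing import Iterable, Sequence


def gdt_ts_fixed(
    ca_distances: Iterable[float],
    n_target_residues: int,
    cutoffs: Sequence[float] = (1.0, 2.0, 4.0, 8.0),
) -> float:
    """Approximate GDT_TS (in percent) from one fixed superposition.

    ca_distances holds the model-to-experiment C-alpha distance, in
    Angstroms, for each residue present in both structures.
    n_target_residues is the length of the target, so residues missing
    from the model count against it.
    """
    if n_target_residues <= 0:
        raise ValueError("n_target_residues must be positive")
    distances = list(ca_distances)
    fractions = [
        sum(1 for d in distances if d <= cutoff) / n_target_residues
        for cutoff in cutoffs
    ]
    return 100.0 * sum(fractions) / len(cutoffs)
```

    By this definition, a score near 15% on a 135-residue target means
    that, averaged over the four cutoffs, only about 20 residues land
    close to their true positions.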