Tue Jun 13 09:38:33 PDT 2006 T0329
Make started Tue Jun 13 09:40:13 PDT 2006
Running on orcas.cse.ucsc.edu

Tue Jun 13 09:42:29 PDT 2006 Kevin Karplus
    Fairly good BLAST hits to 2ah5A (24% over 208, 4.1e-10), 2go7A (22% over 218, 1.2e-08), ...

Tue Jun 13 12:39:26 PDT 2006 Kevin Karplus
    The HMMs are also agreeing on the c.108.1.* superfamily, with over 50 templates matching. Which one is the *best* match seems to vary a bit depending on the HMM, though 2ah5A is often near the top for the target HMMs.

    Problem: the t06 alignment did not get created:

    Error: (open_outputfile): Could not open file /projects/compbio/tmp/tmp_savings_25499.eps
    *** Error: /projects/compbio/experiments/models.97/scripts04/target06: command failed:
    /projects/compbio/bin/i686/makelogo /projects/compbio/tmp/tmp_savings_25499 -i /var/tmp/t06-karplus-orcas.cse.ucsc.edu-25499/iter0_decontam.mod -logo_rel_entropy 1 -logo_savings_output /var/tmp/t06-karplus-orcas.cse.ucsc.edu-25499/iter0_decontam.savings
    gmake[1]: *** [T0329.t06.a2m.gz] Error 9

Make started Tue Jun 13 12:45:44 PDT 2006
Running on orcas.cse.ucsc.edu
    I killed the make and started it again on orcas, in hopes that the problem was a transitory one.

Tue Jun 13 13:06:10 PDT 2006 Kevin Karplus
    The problem is that oldmacd is wedged and /projects/compbio/tmp is not available. I hope that the cluster-admin folks can fix it quickly.

Make started Tue Jun 13 18:25:41 PDT 2006
Running on cheep.cse.ucsc.edu
    Restarted the job now that oldmacd is alive again.

Make started Tue Jun 13 21:22:52 PDT 2006
Running on cheep.cse.ucsc.edu

Tue Jun 13 21:23:48 PDT 2006 Kevin Karplus
    Found a minor bug in Make.main, so restarted the make.

Wed Jun 28 18:12:07 PDT 2006 Martin Madera
    Can't see T0329.try1-opt2.pdb.gz anywhere! Will quickly run try2 to generate *something*. Started at 18:30 on squawk.

Thu Jun 29 00:33:06 PDT 2006
    Try2 finished successfully.

    Like T0330 and T0324, this is a c.108.1.* target.
    I started working on the other two first, so it's worth looking at their READMEs because they all seem to run into the same problem.

    Here's my classification of the top BLAST hits for this target:

    PDB     BLAST E   Classification & notes
    2ah5A   4.1e-10   -A-
    2go7A   1.3e-08   -A-
    1o08A   1.2e-06   -A-
    1rqlA   6.8e-05   elaborated -A- (extra small helix)
    1swvA   1.2e-04   elaborated -A- (RMSD 0.7A with 1rqlA above)
    2fi1A   2.6e-04   -A-
    1fezA   2.6e-04   elaborated -A- (RMSD 0.6A with 1swvA above)
    1rdfA   5.8e-04   elaborated -A-
    1qq5A   0.004     -B-
    2gfhA   0.006     -B-
    1te2A   0.006     slightly modified -A-
    1zrn    0.006     -B-
    1jud    0.006     -B-
    1aq6A   0.006     -B-
    1qq7A   0.008     -B-

    I'll look into the HMM matches tomorrow.

Sun Jul 2 18:58:01 PDT 2006 Martin Madera
    Had a look at try2-opt2, what a disaster! But, on the positive side, no chain breaks.

    I will try an idea that occurred to me last night: try3 will be the same as try2, but I will increase the weight of the dryXX terms in the scoring function, from

        dry5 15  dry6.5 20  dry8 15  dry12 5

    to

        dry5 100  dry6.5 50  dry8 30  dry12 10.

    Running on squawk.

    --

    Try4 will be a standard clean-up of the alignments; I will restrict them to -A- or elaborated -A- structures, i.e.

        2ah5A|2go7A|1o08A|1rqlA|1swvA|2fi1A|1fezA|1rdfA

    Running on peep.

Sun Jul 2 22:11:35 PDT 2006 Martin Madera
    Try3 still running, try4 is done.

    Try4 looks better than try2, but given that try2 is a disaster, that isn't much of an achievement. The structure is based on elaborated -A-, but the packing isn't too good. We can do better than that!

    The top "elaborated -A-" BLAST hit is 1rqlA. It's a dimer, but unfortunately the interface is on the main domain away from the insertion, so playing with dimers won't help with the insertion. (Same for all other elab.-A- matches.)

    I will try what Kevin did for T0324, namely polish the model with a few distance constraints to open up the crevice. I will also try a model based on pure -A-.

Sun Jul 2 23:03:05 PDT 2006 Martin Madera
    Try3 finished.
    I think it's worse than try4 (lots of chain breaks), but much better than try1! So increasing the weight of dryXX may be a good idea, though maybe not as dramatically.

    --

    Try5: polishing try4-opt2, with higher weight for dryXX:

        dry5 30  dry6.5 30  dry8 20  dry12 5

    (instead of:

        dry5 15  dry6.5 20  dry8 15  dry12 5 )

    and the following distance constraints:

        Constraint G48.CA S190.CA 7.0 8.0 30.0 1
        Constraint G50.CA N133.CA 7.0 8.0 30.0 1

    Running on peep.

    --

    Try6: like try4, but further restricted the set of structures to:

        2ah5A|2go7A|1o08A|2fi1A

    Running on squawk.

Mon Jul 3 03:38:00 PDT 2006 Martin Madera
    Try6 is a failure, it completely tore apart the insertion. Maybe if I added more alignments (as Kevin suggested in T0330) or increased the dry weights... maybe later.

    Try5: well, it managed to satisfy the constraint. Except it didn't do what I wanted, it unwound the helix a bit and flipped the loop back instead of moving everything down. Note to self: it's dumb to pick constraints in terms of loops, because they're *flexible*. Pick them in terms of helices!

    A new set of constraints for try7:

        Constraint V52.CA N133.CA 9.5 10.5 30.0 1
        Constraint T56.CA V89.CA 0.0 6.0 8.09 1

    Also reduced the dry weights back to standard. Running on peep.

Mon Jul 3 16:53:54 PDT 2006 Martin Madera
    Try7 finished: V52-N133 is 10.48A, T56-V89 is 8.23A. There's a chain break at V52 -- it didn't shift the loop. The packing looks better than try4, so this is my new favourite.

    For try8, I will make the T56-V89 constraint more stringent to try and pull the helix further down:

        Constraint T56.CA V89.CA 0.0 6.0 7.5 5

    I will add another constraint to try and force the bottom helices to move closer together:

        Constraint A60.CA V84.CA 0.0 5.5 7.5 1

    and one more to move the loop down with the helix:

        Constraint G48.CA S190.CA 4.7 5.7 30 1

    Running on peep.

Mon Jul 3 17:44:35 PDT 2006
    I looked at the alignment models (as part of assembling best-models) and I really like model 5 (which is an -A- structure).
    I've decided to do a chimera of that and try2-opt2 (try2 does the best job so far of the main domain). The work is in chimera/; the insert region is 16-110.

    Try9 is polishing chimera/chimera.pdb. Running on squawk.

Mon Jul 3 19:49:18 PDT 2006 Kevin Karplus
    Nope, try9 is *not* polishing chimera/chimera.pdb, as try9.under requests, because that is not a complete conformation, and undertaker can only optimize complete conformations.

    try8, on the other hand, does seem to be improving its cost function on try4.

Mon Jul 3 20:00:58 PDT 2006 Kevin Karplus
    Rosetta best likes repacking try4-opt2 and try6-opt2 (same score).
    The try9 costfcn likes try7, try5, try8-opt1, try4, try6, try2.
    The try8 costfcn likes try7, try5, try8-opt1, try4, try3, try6.

    There doesn't seem to be a try1, because of the undertaker bug that caused crashes around Jun 13. I didn't get it debugged right away, because of the oldmacd RAID failure that day.

Mon Jul 3 20:46:48 PDT 2006 Martin Madera
    Re try9 -- you're right! I thought the alignment model looked too good to be true. Now I understand: the backbone was nice and contiguous, but unfortunately our sequence has insertions with respect to it (which aren't shown in the structure). OK. And when I tried a type -A- structure in try6, it blew up.

    The best structures so far are in best-models. Try7 is an improved version of try5, no point submitting both. Try2 and try6 blew up and are unlikely to be right; try3 is OK but *lots* of chain breaks in rasmol.

Mon Jul 3 21:04:21 PDT 2006 Kevin Karplus
    By putting the chimeras into the decoys directory, one can usually catch the problems with cost functions that have "missing_atoms" terms.

Date: Mon, 3 Jul 2006 20:57:39 -0700
From: Kevin Karplus
To: martin madera
CC: karplus
Subject: T0329 and other targets

For the T0329 and homologous targets that you are working on, it looks like the t06 alignment is consistently doing better than the t04 and t2k alignments.
You might want to do an initial run, like try1, but with only alignments from the t06 HMMs. One way to do this would be to put all the reasonable templates (basically the top 10 or 20 hits in T0329.t06.best-scores.rdb) into MANUAL_TOP_HITS in the Makefile, do

    make extra_alignments
    make read_alignments
    foreach x (*/read-alignments-scwrl.under)
        grep -h t06 $x > $x:s/scwrl/t06-scwrl/
    end

Then include each of the read-alignments-t06-scwrl.under files to read in the alignments in the try.under script.

I'll do this as try10 for T0329.
------------------------------------------------------------

Mon Jul 3 21:06:34 PDT 2006 Kevin Karplus
    try10 started on cheep.

Mon Jul 3 21:08:47 PDT 2006 Kevin Karplus
    Ooops, restarted. Forgot to make try10.costfcn use only T0326.t06.dssp-ehl2.constraints. It probably doesn't matter for *this* target.

Mon Jul 3 21:12:21 PDT 2006 Kevin Karplus
    Martin wanted to know *why* I believed that t06 was doing so much better on this target. Good question---I may be confusing targets (I've done that a lot today). Let's look at some statistics:

        t06 alignment   6588 sequences   44 from pdb
        t04 alignment   3005 sequences   20 from pdb
        t2k alignment   2343 sequences   19 from pdb

    OK, so t06 is more sensitive, but is the alignment better?

        t06 has 14 key residues, all of which are matched
        t04 has 14 key residues, all of which are matched
        t2k has 16 key residues, all of which are matched

    Hmm---they all seem to be about equally good on these measures. Do they choose different templates?

        Top 5 templates
        t06   2ah5A   1zrn    2gfhA   2fdrA   1te2A
        t04   1jud    2ah5A   2gfhA   2fdrA   1te2A
        t2k   2ah5A   2gfhA   1te2A   2fdrA   2go7A

    There are some slight differences (probably due to the differences in the multiple alignments as the extra hits are in the *smaller* libraries).

    OK, I have no reason at all to believe that t06 will do better than the others on this target. I must have been thinking of some other target. Oh well, I might as well let try10 run anyway.
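    For anyone who does not have csh handy, the foreach/grep loop from the email above can be sketched in Python. This is only an illustration: the function name filter_alignment_lines and its parameters are made up here and are not part of the lab scripts, and the substring test stands in for "grep -h t06", which is equivalent only because "t06" has no regex metacharacters.

```python
from pathlib import Path
from typing import List


def filter_alignment_lines(root: Path, tag: str = "t06") -> List[Path]:
    """Write a <tag>-only copy of every */read-alignments-scwrl.under file.

    Hypothetical helper mirroring the csh loop:
        foreach x (*/read-alignments-scwrl.under)
            grep -h t06 $x > $x:s/scwrl/t06-scwrl/
        end
    """
    written = []
    for src in sorted(root.glob("*/read-alignments-scwrl.under")):
        # Keep only the lines that mention the tag (plain substring match).
        kept = [line for line in src.read_text().splitlines(keepends=True)
                if tag in line]
        # Rename only the file part: read-alignments-<tag>-scwrl.under
        dst = src.with_name(src.name.replace("scwrl", f"{tag}-scwrl", 1))
        dst.write_text("".join(kept))
        written.append(dst)
    return written
```

    Running filter_alignment_lines(Path(".")) in the target directory should leave one read-alignments-t06-scwrl.under next to each original file, ready to be included from the try.under script. Unlike the csh ":s" modifier, which rewrites the first "scwrl" anywhere in the path, this sketch only touches the file name.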

Mon Jul 3 21:33:01 PDT 2006 Martin Madera
    Try11: copied from try6. This is another attempt to get an -A- structure out of undertaker, somehow. The structures I'm interested in are:

        2ah5A|2go7A|1o08A|2fi1A

    Kevin already added all the structures to MANUAL_TOP_HITS as part of try10, so all I did was modify try11.under to:

        InfilePrefix 2ah5A/
        include read-alignments-scwrl.under
        InfilePrefix 2go7A/
        include read-alignments-scwrl.under
        InfilePrefix 1o08A/
        include read-alignments-scwrl.under
        InfilePrefix 2fi1A/
        include read-alignments-scwrl.under

    for the last TryAllAlign. We'll see if this helps; if not, I'll increase the dry terms. Running on squawk.

Mon Jul 3 22:26:28 PDT 2006
    Try8 finished. The constraints were:

                                                      try7 / try8
        Constraint V52.CA N133.CA 9.5 10.5 30.0 1     10.49 / 7.67
        Constraint T56.CA V89.CA  0.0 6.0  8.09 1      8.24 / 7.96
        Constraint A60.CA V84.CA  0.0 5.5  7.5  1      8.66 / 8.35
        Constraint G48.CA S190.CA 4.7 5.7  30   1      4.66 / 4.91

    and I added the actual distances for try7 / try8 (taken from try8-opt2.constraints for try8 and rasmol for try7). For the first constraint, try7 has a break, so that doesn't count; otherwise it is an improvement.

    Try12: same as try8, but
        - bumping up the constraint weights to 5 (from 1),
        - increasing constraints to 40 (from 20),
        - removing the helix constraints in the region of the insertion (16-109)

Mon Jul 3 22:45:27 PDT 2006 Kevin Karplus
    Martin, I'm confused about what you just said. Are you saying that try8 is an improvement over try7 or not? Should it replace try7 in superimpose-best.under?

    try7-opt2 does score better with the try8 costfcn. Does that mean anything? try7-opt2 also scores best with the try12 costfcn.

    Is there a reason why try12 starts only from try4-opt2, and not from all existing models?

Mon Jul 3 22:55:34 PDT 2006 Martin Madera
    Try8 looks better than try7, but scores worse.
    However, try7 is cheating (on try8's cost function) -- it has a massive break that allows it to score better on an important constraint (the first one); note that try7 does worse on breaks. Because it scores worse, I thought it wasn't worth a resubmission. But now I realize that you haven't submitted it yet! Updated best-models.

    Re starting from try4, hmm... I'd like to avoid accumulated drift from the alignment structures. Also, the breakdown of the various attempts is as follows:

        failure:  try9
        blow up:  try2, try6
        breaks:   try3
        bad loop: try5
        ok:       try4, try7
        (haven't looked at try10 yet)

    -- so really the only models that are decent are try5 (but I'd prefer undertaker not seeing the loop) and try7 -- but then I'm trying to move it further from try4 than try7 is.

Mon Jul 3 23:14:08 PDT 2006 Kevin Karplus
    try10 scores very well with the unconstrained cost fcn, and it has a rather different structure for the helical domain. I'd like to include it in the mix---what order?

    I think that try8-opt and try4-opt2 are so close to each other that we could probably drop try4 out.

    I'll do a tentative submission of

        try8-opt2
        try10-opt2
        try5-opt2
        align5 t06 1swvA
        align2 t04 2gfhA

    No---try5 is too close to try8 also. Do we have a *different* model worth including?

Mon Jul 3 23:21:59 PDT 2006 Martin Madera
    No, all of my attempts so far have been elaborations of try4. I'd do --

        try8-opt2
        try10-opt2
        chimera.pdb (but that may be close to try10?)
        model 5
        model 2

    SORRY being stupid:

        -A-      ... chimera
        elab -A- ... all my successful tries, model 5
        -B-      ... model 2, try10

    so

        try8-opt2
        try10-opt2
        chimera.pdb
        model 5
        model 2

    is a good list.

Mon Jul 3 23:31:08 PDT 2006 Kevin Karplus
    I'm confused again---I thought that the chimera had an unfixable insertion in it, which is why try9 blew up.

    I'm starting a try13, which will try polishing (no constraints) from the gromacs models. I suspect that it will concentrate on try8-opt2, but I could be surprised.

Mon Jul 3 23:38:00 PDT 2006 Kevin Karplus
    Martin, are you aware that try11 is using all the alignments (from all-align.a2m)? Also, you did not save the try11.under file before running try11, so what you are running read only one of the read-alignments-scwrl files.

    try13 does indeed seem to concentrate on improving try8-opt2.gromacs. Perhaps I should do another run just from try10-opt2.gromacs.

Mon Jul 3 23:42:56 PDT 2006 Martin Madera
    Well spotted re try11 -- I noticed that I had an open window with an unsaved (and edited) try11.under file! So I restarted the try11 run and was about to write it in the README. (And indeed, it only read one of the four read-alignments-scwrl files.)

    As far as I can see, try11 should now be using edit2.all-align.a2m, not all-align.a2m.

    Thanks for the gromacs run, I was about to do something similar myself.

    I think try10 is a blind alley -- look at the actual structure! It may score well but it looks wrong. (But then again, who knows.)

Mon Jul 3 23:51:13 PDT 2006 Kevin Karplus
    I started try14 to optimize try10-opt2.gromacs0.

    I don't see what is so terrible about try10-opt2. There is a large cavity, but there is a large cavity in some of the templates also. If you use the T0329.t06.str2-color.rasmol script, you'll see that the secondary structure prediction (from that network) matches pretty well.

    I have made a submission with comment

        For T0329 (as with the several homologous targets), the alpha/beta domain was easily modeled, but we had two main choices for the helical domain (arbitrarily A and B).

        Model 1 is try8-opt2, our current best for domain type A.

        Model 2 is try10-opt2, our current best for domain type B.

        Model 3 is try4-opt2.repack-nonPC, the backbone of try4-opt2, with sidechains repacked by rosetta. It is the repacking that rosetta likes best. (Try4-opt2 was optimized to form model 1)

        Model 4 is sidechain replacement by SCWRL on an alignment to 1swvA (type A).

        Model 5 is sidechain replacement by SCWRL on an alignment to 2gfhA (type B).

Tue Jul 4 08:15:09 PDT 2006 Kevin Karplus
    try11-try14 have now finished.

    With the unconstrained or try13=try14 costfcns, try14-opt2 (based on try10-opt2) scores best. try10-opt2 is next with unconstrained, but try13-opt2 is next with try13=try14.

    With the try12 costfcn, the order is try12, try7, try3, try13, try8. The question now is whether the constraints satisfied by try12 should outweigh the breaks and clashes.

    With the try11 costfcn, the order is try14, try13, try11, try10, try7, try8 (try11 is optimized from alignments, particularly 2ah5A).

    Rosetta best likes repacking try13-opt2, but try14-opt2 also beats the old best (try4-opt2).

    I did not much care for either try11 or try12---the helices have been unwound or torn apart.

    I think we should now submit

        try14-opt2
        try13-opt2
        something (try4-opt2.repack-nonPC?)
        align1 t04 2ah5A
        align5 t06 1swvA

    Martin, please tell me what to use for the 3rd model. Also tell me if there is some reason to prefer align2 (2gfhA) to align1 (2ah5A).

    Since the soft deadline is noon today, I'll submit the list above now.

Tue Jul 4 08:41:39 PDT 2006 Kevin Karplus
    So submitted.

Tue Jul 4 14:30:28 PDT 2006 Martin Madera
    Oops, slept longer than I had intended. But full of energy now!

    First, let me make sense of the runs.

    Try11: I started that one. It's an -A- structure, and those have proved very difficult with this target [see the history for try4 (which produced elab-A-), try6 and try9]. And it's the best -A- structure we have so far; the bulk of it is well packed, it's just that front helix that's screwed up. (Expected: that's where we have a large insertion wrt type -A- structures.)
    I think playing around with -A- structures is important, because remember they're our top BLAST hits:

    PDB     BLAST E   Classification & notes
    2ah5A   4.1e-10   -A-
    2go7A   1.3e-08   -A-
    1o08A   1.2e-06   -A-
    1rqlA   6.8e-05   elaborated -A- (extra small helix)
    1swvA   1.2e-04   elaborated -A- (RMSD 0.7A with 1rqlA above)
    2fi1A   2.6e-04   -A-
    1fezA   2.6e-04   elaborated -A- (RMSD 0.6A with 1swvA above)
    1rdfA   5.8e-04   elaborated -A-
    1qq5A   0.004     -B-
    2gfhA   0.006     -B-
    1te2A   0.006     slightly modified -A-
    1zrn    0.006     -B-
    1jud    0.006     -B-
    1aq6A   0.006     -B-
    1qq7A   0.008     -B-

    I'll try moving the helix with ProteinShop tomorrow.

    Try12: I started that one. Elaborated -A-. It was a further attempt in the try4, try7, try8, try12 series. I don't like what it did; I'll have a proper look later, but for now try7/try8 are better.

    Try13: Kevin's run, polishing try2..10-opt2.gromacs0.pdb.gz, no constraints. As expected, resulted in an elab-A- structure. It looks OK (about the same as try7/try8), but nothing to get too excited about. However if it scores well, it's probably the best model in the series.

    Try14: Kevin's run, polishing try10-opt2.gromacs0.pdb.gz, no constraints. Looks very similar to try10, but I'm sure the devil is in the detail.

    I agree with the list that Kevin submitted, those are the best structures we have so far.

    But I think try11 shows a lot of promise. Apart from the helix that's sticking out (our insertion wrt the classic -A- structures), the packing is the best so far. This isn't surprising, because -- to repeat -- those are our top BLAST hits. Kevin, could you have a look and tell me what you think?

Tue Jul 4 16:11:48 PDT 2006 Martin Madera
    **OOOPS** managed to overwrite the .under and .costfcn for try7 by doing:

        -bash-3.00$ cp try5.costfcn try7.costfcn
        -bash-3.00$ cp try5.under try7.under
        -bash-3.00$ emacs try7.under

    I thought I was in the T0330 directory, but I was still in T0329.

Tue Jul 4 16:52:37 PDT 2006 Kevin Karplus
    try7.under and try7.costfcn restored from my home computer.
    I find that using emacs to do the copies (in the directory listing, using 'C') helps avoid overwriting, since it asks if you intend to overwrite. Even better, I go to the file try7.under in emacs, then insert file try5.under. If try7.under already exists, I see it before I do the copy.

Tue Jul 4 16:56:55 PDT 2006 Kevin Karplus
    I'm still not impressed with try11, but if you can clean up the helices that are disordered, I'm certainly willing to include it. I wouldn't put too much faith in the HMMs for this region---they are recognizing mainly the other domain.

Tue Jul 4 17:35:18 PDT 2006 Martin Madera
    Thanks for those try7 files!

    Re try11, do

        restrict 1-64,90-239

    I think that part looks very good. The problem is what to do about

        select 65-89
        color red
        select *

Tue Jul 4 17:44:34 PDT 2006 Martin Madera
    So, what went wrong with try12? The constraints were:

                                                      actual
        Constraint G48.CA S190.CA 4.7 5.7  30   5     8.58
        Constraint V52.CA N133.CA 9.5 10.5 30.0 5     10.7
        Constraint T56.CA V89.CA  0.0 6.0  8.09 5     6.0
        Constraint A60.CA V84.CA  0.0 5.5  7.5  5     5.5

    and I appended the actual distances (taken from try12-opt2.constraints; checked rasmol, rasmol agrees).

    The first two, G48-S190 and V52-N133, were an attempt to push the loop further down. And they backfired, because it pushed the loop to the left instead. Maybe "pushing away" is a bad idea, because there are lots of directions in which this can be done.

    The second two, T56-V89 and A60-V84, were an attempt to slide the helix further down. And they worked (albeit at a cost of introducing breaks in the chain around T56 and A60). Unfortunately this made the cavity on the other side of the helix even worse. I think I need to add in more constraints to try and close it.

Wed Jul 5 06:48:14 PDT 2006 Kevin Karplus
    I've never had much luck with "keep-away" constraints. As you say, there are so many directions to move in. Strong constraints are a bit dangerous, as they can override all the terms that are trying to keep the model protein-like.

    If you have two models that each have good parts, you can sometimes make a chimera by superimposing them on a shared section, then doing cut and paste with the editor. Are there any models that have 65-89 that are compatible with try11 that you could paste in? It looks to me like the spacing may make it difficult to fill in the helices.

Mon Aug 21 15:33:32 PDT 2006 Kevin Karplus
    The outer domain is well modeled, but we had some trouble with the inserted helical domain.

    Our best model is try11-opt2.gromacs0, followed closely by try11-opt2. It looks like Martin's intuition that try11 looked good was better than mine.

    Our best submitted model was model2 (try13-opt2), which was somewhat worse than try11-opt2.

    This target was an outlier, because the align1 GDT (64.64%) was much better than the model1 GDT (46.76%). Note that the model 2 GDT of 60.36% was still not as good as align1, and even our best model (try11-opt2.gromacs0) still only had a GDT of 64.64%.
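
    Side note on the try8 and try12 constraint tables earlier in this log: comparing the requested and measured distances by eye is error-prone, and a few lines of Python can tabulate them. This is a rough sketch, not part of undertaker. It ASSUMES that the three numbers after the atom names are minimum, desired, and maximum distance and that the next field is the weight; the log only confirms the weight (it was raised "to 5 (from 1)"). The names Constraint, parse_constraint, and deviation are made up for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Constraint:
    atom_a: str
    atom_b: str
    low: float       # ASSUMED meaning: minimum distance (Angstroms)
    target: float    # ASSUMED meaning: desired distance
    high: float      # ASSUMED meaning: maximum distance
    weight: float    # confirmed by the log as the constraint weight
    actual: Optional[float] = None  # measured distance appended in the log


def parse_constraint(line: str) -> Constraint:
    """Parse 'Constraint A B low target high weight [actual]'."""
    fields = line.split()
    if len(fields) not in (7, 8) or fields[0] != "Constraint":
        raise ValueError(f"not a constraint line: {line!r}")
    low, target, high, weight = (float(f) for f in fields[3:7])
    actual = float(fields[7]) if len(fields) == 8 else None
    return Constraint(fields[1], fields[2], low, target, high, weight, actual)


def deviation(c: Constraint) -> float:
    """Signed difference between the measured and the (assumed) desired distance."""
    if c.actual is None:
        raise ValueError("no measured distance recorded for this constraint")
    return c.actual - c.target


if __name__ == "__main__":
    # The try12 lines, with the measured distance as the last field.
    for text in (
        "Constraint G48.CA S190.CA 4.7 5.7 30 5 8.58",
        "Constraint V52.CA N133.CA 9.5 10.5 30.0 5 10.7",
        "Constraint T56.CA V89.CA 0.0 6.0 8.09 5 6.0",
        "Constraint A60.CA V84.CA 0.0 5.5 7.5 5 5.5",
    ):
        c = parse_constraint(text)
        print(f"{c.atom_a}-{c.atom_b}: {deviation(c):+.2f}")
```

    Under that reading of the fields, the two helix constraints (T56-V89 and A60-V84) sit exactly on their desired distances in try12, while G48-S190 is almost 3A off, which matches Martin's description of what worked and what backfired.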