Tue Jun 13 09:38:33 PDT 2006
T0329
Make started Tue Jun 13 09:40:13 PDT 2006
Running on orcas.cse.ucsc.edu

Tue Jun 13 09:42:29 PDT 2006 Kevin Karplus

Fairly good BLAST hits to 2ah5A (24% over 208, 4.1e-10), 2go7A (22%
over 218, 1.2e-08), ...


Tue Jun 13 12:39:26 PDT 2006 Kevin Karplus

The HMMs are also agreeing on the c.108.1.* superfamily, with over 50
templates matching.  Which one is the *best* match seems to vary a bit
depending on the HMM, though 2ah5A is often near the top for the
target HMMs.

Problem: the t06 alignment did not get created:
Error: (open_outputfile):  Could not open file /projects/compbio/tmp/tmp_savings_25499.eps
*** Error: /projects/compbio/experiments/models.97/scripts04/target06: command failed: /projects/compbio/bin/i686/makelogo /projects/compbio/tmp/tmp_savings_25499 -i /var/tmp/t06-karplus-orcas.cse.ucsc.edu-25499/iter0_decontam.mod -logo_rel_entropy 1 -logo_savings_output /var/tmp/t06-karplus-orcas.cse.ucsc.edu-25499/iter0_decontam.savings
gmake[1]: *** [T0329.t06.a2m.gz] Error 9

Make started Tue Jun 13 12:45:44 PDT 2006
Running on orcas.cse.ucsc.edu

I killed the make and started it again on orcas, in hopes that the
problem was a transitory one.

Tue Jun 13 13:06:10 PDT 2006 Kevin Karplus

The problem is that oldmacd is wedged and /projects/compbio/tmp is not
available.  I hope that the cluster-admin folks can fix it quickly.

Make started Tue Jun 13 18:25:41 PDT 2006
Running on cheep.cse.ucsc.edu

Restarted the job now that oldmacd is alive again.

Make started Tue Jun 13 21:22:52 PDT 2006
Running on cheep.cse.ucsc.edu

Tue Jun 13 21:23:48 PDT 2006 Kevin Karplus
Found a minor bug in Make.main, so restarted the make.


Wed Jun 28 18:12:07 PDT 2006 Martin Madera

Can't see T0329.try1-opt2.pdb.gz anywhere! Will quickly run try2 to
generate *something*. Started at 18:30 on squawk.


Thu Jun 29 00:33:06 PDT 2006

Try2 finished successfully.

Like T0330 and T0324, this is a c.108.1.* target. I started working on
the other two first, so it's worth looking at their READMEs because
they all seem to run into the same problem.

Here's my classification of the top BLAST hits for this target:

PDB	BLAST E	Classification & notes
2ah5A	4.1e-10	-A-
2go7A	1.3e-08	-A-
1o08A	1.2e-06	-A-
1rqlA	6.8e-05	elaborated -A- (extra small helix)
1swvA	1.2e-04	elaborated -A- (RMSD 0.7A with 1rqlA above)
2fi1A	2.6e-04	-A-
1fezA	2.6e-04	elaborated -A- (RMSD 0.6A with 1swvA above)
1rdfA	5.8e-04	elaborated -A-
1qq5A	0.004	-B-
2gfhA	0.006	-B-
1te2A	0.006	slightly modified -A-
1zrn	0.006	-B-
1jud	0.006	-B-
1aq6A	0.006	-B-
1qq7A	0.008	-B-

I'll look into the HMM matches tomorrow.


Sun Jul  2 18:58:01 PDT 2006 Martin Madera

Had a look at try2-opt2, what a disaster! But, on the positive side,
no chain breaks.

I will try an idea that occurred to me last night: try3 will be the
same as try2, but I will increase the weight of the dryXX terms in the
scoring function, from

    dry5 15 dry6.5 20 dry8 15 dry12 5 

to

    dry5 100 dry6.5 50 dry8 30 dry12 10.

Running on squawk.

--

Try4 will be a standard clean-up of the alignments; I will restrict
them to -A- or elaborated -A- structures, i.e.

2ah5A|2go7A|1o08A|1rqlA|1swvA|2fi1A|1fezA|1rdfA

Running on peep.
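
A minimal Python sketch of the template restriction used for try4. The
hits and classifications are copied from the BLAST table in the Jun 29
entry; the function name templates_for is hypothetical, and building the
"|"-separated list by script rather than by hand is an assumption made
only for illustration.

```python
# (PDB id, BLAST E-value, classification) copied from the BLAST table.
BLAST_HITS = [
    ("2ah5A", 4.1e-10, "-A-"),
    ("2go7A", 1.3e-08, "-A-"),
    ("1o08A", 1.2e-06, "-A-"),
    ("1rqlA", 6.8e-05, "elaborated -A-"),
    ("1swvA", 1.2e-04, "elaborated -A-"),
    ("2fi1A", 2.6e-04, "-A-"),
    ("1fezA", 2.6e-04, "elaborated -A-"),
    ("1rdfA", 5.8e-04, "elaborated -A-"),
    ("1qq5A", 0.004, "-B-"),
    ("2gfhA", 0.006, "-B-"),
    ("1te2A", 0.006, "slightly modified -A-"),
    ("1zrn", 0.006, "-B-"),
    ("1jud", 0.006, "-B-"),
    ("1aq6A", 0.006, "-B-"),
    ("1qq7A", 0.008, "-B-"),
]


def templates_for(classes):
    """Return a '|'-separated list of hits whose class is in `classes`,
    keeping the table (E-value) order."""
    return "|".join(pdb for pdb, _evalue, cls in BLAST_HITS if cls in classes)


# -A- or elaborated -A- structures only.
print(templates_for({"-A-", "elaborated -A-"}))
```

Passing a smaller set such as {"-A-"} narrows the list to the pure -A-
structures in the same way.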


Sun Jul  2 22:11:35 PDT 2006 Martin Madera

Try3 still running, try4 is done.

Try4 looks better than try2, but given that try2 is a disaster, that
isn't much of an achievement. The structure is based on elaborated
-A-, but the packing isn't too good. We can do better than that! The
top "elaborated -A-" BLAST hit is 1rqlA. It's a dimer, but
unfortunately the interface is on the main domain away from the
insertion, so playing with dimers won't help with the insertion. (Same
for all other elab.-A- matches.)

I will try what Kevin did for T0324, namely polish the model with a
few distance constraints to open up the crevice. I will also try a
model based on pure -A-.


Sun Jul  2 23:03:05 PDT 2006 Martin Madera

Try3 finished. I think it's worse than try4 (lots of chain breaks),
but much better than try2! So increasing the weight of dryXX may be a
good idea, though maybe not as dramatically.

--

Try5: polishing try4-opt2, with higher weight for dryXX:

    dry5 30 dry6.5 30 dry8 20 dry12 5

(instead of:

    dry5 15 dry6.5 20 dry8 15 dry12 5 )

and the following distance constraints:

Constraint	G48.CA	S190.CA		7.0 8.0 30.0	1
Constraint	G50.CA	N133.CA		7.0 8.0 30.0	1

Running on peep.

--

Try6: like try4, but further restricted the set of structures to:

2ah5A|2go7A|1o08A|2fi1A

Running on squawk.


Mon Jul  3 03:38:00 PDT 2006 Martin Madera

Try6 is a failure, it completely tore apart the insertion. Maybe if I
added more alignments (as Kevin suggested in T0330) or increased the
dry weights... maybe later.

Try5: well, it managed to satisfy the constraint. Except it didn't do
what I wanted, it unwound the helix a bit and flipped the loop back
instead of moving everything down. Note to self: it's dumb to pick
constraints in terms of loops, because they're *flexible*. Pick them
in terms of helices!

A new set of constraints for try7:

Constraint	V52.CA	N133.CA		9.5 10.5 30.0	1
Constraint	T56.CA	V89.CA		0.0 6.0 8.09	1

Also reduced the dry weights back to standard. Running on peep.


Mon Jul  3 16:53:54 PDT 2006 Martin Madera

Try7 finished: V52-N133 is 10.48A, T56-V89 is 8.23A. There's a chain
break at V52 -- it didn't shift the loop. The packing looks better
than try4, so this is my new favourite.

For try8, I will make the T56-V89 constraint more stringent to try and
pull the helix further down:

Constraint	T56.CA	V89.CA		0.0 6.0 7.5	5

I will add another constraint to try and force the bottom helices to
move closer together:

Constraint	A60.CA	V84.CA		0.0 5.5 7.5	1

and one more to move the loop down with the helix:

Constraint	G48.CA	S190.CA		4.7 5.7	30	1

Running on peep.


Mon Jul  3 17:44:35 PDT 2006

I looked at the alignment models (as part of assembling best-models)
and I really like model 5 (which is an -A- structure). I've decided to
do a chimera of that and try2-opt2 (try2 does the best job so far of
the main domain). The work is in chimera/; the insert region is
16-110.

Try9 is polishing chimera/chimera.pdb. Running on squawk.

Mon Jul  3 19:49:18 PDT 2006 Kevin Karplus

Nope, try9 is *not* polishing chimera/chimera.pdb, as try9.under
requests, because that is not a complete conformation, and undertaker
can only optimize complete conformations.  

try8, on the other hand, does seem to be improving its cost function
on try4.

Mon Jul  3 20:00:58 PDT 2006 Kevin Karplus

Rosetta best likes repacking try4-opt2 and try6-opt2 (same score)
The try9 costfcn likes try7, try5, try8-opt1, try4, try6, try2.
The try8 costfcn likes try7, try5, try8-opt1, try4, try3, try6

There doesn't seem to be a try1, because of the undertaker bug that
caused crashes around Jun 13.  I didn't get it debugged right away,
because of the oldmacd RAID failure that day.


Mon Jul  3 20:46:48 PDT 2006 Martin Madera

Re try9 -- you're right! I thought the alignment model looked too good
to be true. Now I understand: the backbone was nice and contiguous,
but unfortunately our sequence has insertions with respect to it
(which aren't shown in the structure). OK. And when I tried a type -A-
structure in try6, it blew up.

The best structures so far are in best-models. Try7 is an improved
version of try5, no point submitting both. Try2 and try6 blew up and
are unlikely to be right; try3 is OK but *lots* of chain breaks in
rasmol.

Mon Jul  3 21:04:21 PDT 2006 Kevin Karplus

By putting the chimeras into the decoys directory, one can usually
catch the problems with cost functions that have "missing_atoms" terms.

Date: Mon, 3 Jul 2006 20:57:39 -0700
From: Kevin Karplus 
To: martin madera
CC: karplus
Subject: T0329 and other targets

For the T0329 and homologous targets that you are working on, it looks
like the t06 alignment is consistently doing better than the t04 and
t2k alignments.  You might want to do an initial run, like try1, but
with only alignments from the t06 HMMs.

One way to do this would be to put all the reasonable templates
(basically the top 10 or 20 hits in T0329.t06.best-scores.rdb) into 
MANUAL_TOP_HITS in the Makefile, do
	make extra_alignments
	make read_alignments
	foreach x (*/read-alignments-scwrl.under)
	    grep -h t06 $x > $x:s/scwrl/t06-scwrl/
	end

Then include each of the read-alignments-t06-scwrl.under files to read
in the alignments in the try.under script.

I'll do this as try10 for T0329.

------------------------------------------------------------
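
For anyone not using csh, the foreach loop in the email above can be
sketched in Python roughly as follows. The function name
write_t06_scripts is hypothetical. One small difference: the csh
:s/scwrl/t06-scwrl/ modifier rewrites the first "scwrl" anywhere in the
path, whereas this sketch only renames the file itself.

```python
from pathlib import Path


def write_t06_scripts(root="."):
    """For each <template>/read-alignments-scwrl.under below `root`, write
    a sibling read-alignments-t06-scwrl.under that keeps only the lines
    mentioning t06 (the equivalent of `grep -h t06`)."""
    written = []
    for src in sorted(Path(root).glob("*/read-alignments-scwrl.under")):
        dst = src.with_name("read-alignments-t06-scwrl.under")
        kept = [line for line in src.read_text().splitlines(keepends=True)
                if "t06" in line]
        dst.write_text("".join(kept))
        written.append(dst)
    return written
```

Calling write_t06_scripts() in the target directory returns the list of
files written, which can then be included from the try.under script.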

Mon Jul  3 21:06:34 PDT 2006 Kevin Karplus

try10 started on cheep.

Mon Jul  3 21:08:47 PDT 2006 Kevin Karplus

Ooops, restarted.  Forgot to make try10.costfcn use only
T0326.t06.dssp-ehl2.constraints.  It probably doesn't matter for
*this* target.

Mon Jul  3 21:12:21 PDT 2006 Kevin Karplus

Martin wanted to know *why* I believed that t06 was doing so much
better on this target. Good question---I may be confusing targets
(I've done that a lot today).  Let's look at some statistics:
	t06 alignment	6588 sequences 44 from pdb
	t04 alignment	3005 sequences 20 from pdb
	t2k alignment	2343 sequences 19 from pdb

OK, so t06 is more sensitive, but is the alignment better?

	t06 has 14 key residues, all of which are matched
	t04 has 14 key residues, all of which are matched
	t2k has 16 key residues, all of which are matched

Hmm---they all seem to be about equally good on these measures.

Do they choose different templates?
		Top 5 templates
	t06	2ah5A 1zrn 2gfhA 2fdrA 1te2A
	t04	1jud 2ah5A 2gfhA 2fdrA 1te2A
	t2k	2ah5A 2gfhA 1te2A 2fdrA 2go7A

There are some slight differences (probably due to the differences in
the multiple alignments as the extra hits are in the *smaller*
libraries). 
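
The overlap between the three top-5 lists can be checked with a few
lines of Python (a sketch only; the template ids are copied from the
table above, and the variable names are made up for illustration):

```python
# Top 5 templates per HMM, copied from the table above.
TOP5 = {
    "t06": ["2ah5A", "1zrn", "2gfhA", "2fdrA", "1te2A"],
    "t04": ["1jud", "2ah5A", "2gfhA", "2fdrA", "1te2A"],
    "t2k": ["2ah5A", "2gfhA", "1te2A", "2fdrA", "2go7A"],
}

# Templates that appear in every top-5 list.
shared = set.intersection(*(set(hits) for hits in TOP5.values()))

# Templates in each list that are not shared by all three.
unique = {name: sorted(set(hits) - shared) for name, hits in TOP5.items()}

print("shared:", sorted(shared))
for name, extra in unique.items():
    print(name, "only:", extra)
```

Four of the five templates are common to all three lists; each HMM
contributes exactly one template that is not in all three.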

OK, I have no reason at all to believe that T06 will do better than
the others on this target.  I must have been thinking of some other target.

Oh well, I might as well let try10 run anyway.


Mon Jul  3 21:33:01 PDT 2006 Martin Madera

Try11: copied from try6. This is another attempt to get an -A- structure
out of undertaker, somehow. The structures I'm interested in are:

2ah5A|2go7A|1o08A|2fi1A

Kevin already added all the structures to MANUAL_TOP_HITS as part of
try10, so all I did was modify try11.under to:

InfilePrefix 2ah5A/
include read-alignments-scwrl.under
InfilePrefix 2go7A/
include read-alignments-scwrl.under
InfilePrefix 1o08A/
include read-alignments-scwrl.under
InfilePrefix 2fi1A/
include read-alignments-scwrl.under

for the last TryAllAlign. We'll see if this helps; if not, I'll increase
the dry terms. Running on squawk.


Mon Jul  3 22:26:28 PDT 2006

Try8 finished. The constraints were:

Constraint      V52.CA  N133.CA         9.5 10.5 30.0   1	10.49 / 7.67
Constraint      T56.CA  V89.CA          0.0 6.0 8.09    1	8.24 / 7.96
Constraint      A60.CA  V84.CA          0.0 5.5 7.5     1	8.66 / 8.35
Constraint      G48.CA  S190.CA         4.7 5.7 30      1	4.66 / 4.91

and I added the actual distances for try7 / try8 (taken from
try8-opt2.constraints for try8 and rasmol for try7).

For the first constraint, try7 has a break, so that doesn't count;
otherwise it is an improvement.
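
A rough Python sketch of that comparison, using the constraint lines and
measured distances listed above. ASSUMPTION: the three numbers after the
atom names are read here as minimum, desired, and maximum distance,
followed by the weight; that reading is not spelled out in these notes,
and the helper names are hypothetical.

```python
# Constraint lines from above; the trailing "a / b" are the measured
# try7 / try8 distances.
LINES = """\
Constraint V52.CA N133.CA 9.5 10.5 30.0 1 10.49 / 7.67
Constraint T56.CA V89.CA 0.0 6.0 8.09 1 8.24 / 7.96
Constraint A60.CA V84.CA 0.0 5.5 7.5 1 8.66 / 8.35
Constraint G48.CA S190.CA 4.7 5.7 30 1 4.66 / 4.91
"""


def parse(line):
    f = line.split()
    # ASSUMPTION: fields 3-5 are min, desired, max distance; field 6 is the weight.
    return {
        "pair": f"{f[1]}-{f[2]}",
        "min": float(f[3]),
        "desired": float(f[4]),
        "max": float(f[5]),
        "try7": float(f[7]),
        "try8": float(f[9]),
    }


def closer_in_try8(c):
    """True if try8 is nearer the desired distance than try7."""
    return abs(c["try8"] - c["desired"]) < abs(c["try7"] - c["desired"])


constraints = [parse(line) for line in LINES.splitlines()]
improved = [c["pair"] for c in constraints if closer_in_try8(c)]

for c in constraints:
    for model in ("try7", "try8"):
        inside = c["min"] <= c[model] <= c["max"]
        print(c["pair"], model, c[model], "in range" if inside else "out of range")
print("closer to desired in try8:", improved)
```

Under that reading, try8 is closer to the desired distance on every
constraint except V52-N133, which matches the remark above.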

Try12: same as try8, but

- bumping up the constraint weights to 5 (from 1),
- increasing constraints to 40 (from 20),
- removing the helix constraints in the region of the insertion (16-109)


Mon Jul  3 22:45:27 PDT 2006 Kevin Karplus

Martin, I'm confused about what you just said.
Are you saying that try8 is an improvement over try7 or not?
Should it replace try7 in superimpose-best.under?

try7-opt2 does score better with the try8 costfcn.  Does that mean anything?
try7-opt2 also scores best with the try12 costfcn.

Is there a reason why try12 starts only from try4-opt2, and not from
all existing models?


Mon Jul  3 22:55:34 PDT 2006 Martin Madera

Try8 looks better than try7, but scores worse. However, try7 is
cheating (on try8's cost function) -- it has a massive break that
allows it to score better on an important constraint (the first
one); note that try7 does worse on breaks.

Because it scores worse, I thought it wasn't worth a resubmission. But
now I realize that you haven't submitted it yet! Updated best-models.

Re starting from try4, hmm... I'd like to avoid accumulated drift from
the alignment structures. 

Also, the breakdown of the various attempts is as follows:

failure: try9
blow up: try2, try6
breaks: try3
bad loop: try5
ok: try4, try7

(haven't looked at try10 yet) -- so really the only models that are
decent are try5 (but I'd prefer undertaker not seeing the loop) and
try7 -- but then I'm trying to move it further from try4 than try7 is.

Mon Jul  3 23:14:08 PDT 2006 Kevin Karplus

try10 scores very well with the unconstrained cost fcn, and it has a
rather different structure for the helical domain.  I'd like to
include it in the mix---what order?  I think that try8-opt and
try4-opt2 are so close to each other that we could probably drop try4 out.

I'll do a tentative submission of
	try8-opt2
	try10-opt2
	try5-opt2
	align5	t06 1swvA
	align2	t04 2gfhA

No---try5 is too close to try8 also.
Do we have a *different* model worth including?


Mon Jul  3 23:21:59 PDT 2006 Martin Madera

No, all of my attempts so far have been elaborations of try4.

I'd do --

	try8-opt2
	try10-opt2
	chimera.pdb (but that may be close to try10?)
	model 5
	model 2 

SORRY being stupid:

-A- ... chimera
elab -A- ... all my successful tries, model 5
-B- ... model 2, try10

so 

	try8-opt2
	try10-opt2
	chimera.pdb
	model 5
	model 2 

is a good list.

Mon Jul  3 23:31:08 PDT 2006 Kevin Karplus

I'm confused again---I thought that the chimera had an unfixable
insertion in it, which is why try9 blew up.

I'm starting a try13, which will try polishing (no constraints) from
the gromacs models.  I suspect that it will concentrate on try8-opt2,
but I could be surprised.

Mon Jul  3 23:38:00 PDT 2006 Kevin Karplus

Martin, are you aware that try11 is using all the alignments (from all-align.a2m)?
Also, you did not save the try11.under file before running try11, so
what you are running reads only one of the read-alignments-scwrl files.

try13 does indeed seem to concentrate on improving try8-opt2.gromacs.
Perhaps I should do another run just from try10-opt2.gromacs.


Mon Jul  3 23:42:56 PDT 2006 Martin Madera

Well spotted re try11 -- I noticed that I had an open window with an
unsaved (and edited) try11.under file! So I restarted the try11 run
and was about to write it in the README. (And indeed, it only read one
of the four read-alignments-scwrl files.)

As far as I can see, try11 should now be using edit2.all-align.a2m,
not all-align.a2m.

Thanks for the gromacs run, I was about to do something similar myself.

I think try10 is a blind alley -- look at the actual structure! It may
score well but it looks wrong. (But then again, who knows.)

Mon Jul  3 23:51:13 PDT 2006 Kevin Karplus

I started try14 to optimize try10-opt2.gromacs0  

I don't see what is so terrible about try10-opt2.  There is a large
cavity, but there is a large cavity in some of the templates also.
If you use the T0329.t06.str2-color.rasmol script, you'll see that the
secondary structure prediction (from that network) matches pretty well.

I have made a submission with comment
	

    For T0329 (as with the several homologous targets), the alpha/beta
    domain was easily modeled, but we had two main choices for the helical
    domain (arbitrarily A and B).

    Model 1 is try8-opt2, our current best for domain type A.

    Model 2 is try10-opt2, our current best for domain type B.

    Model 3 is try4-opt2.repack-nonPC, the backbone of try4-opt2, with
	    sidechains repacked by rosetta.  It is the repacking that rosetta
	    likes best. (Try4-opt2 was optimized to form model 1)

    Model 4 is sidechain replacement by SCWRL on an alignment to 1swvA
	    (type A).

    Model 5 is sidechain replacement by SCWRL on an alignment to 2gfhA
	    (type B).

	
Tue Jul  4 08:15:09 PDT 2006 Kevin Karplus

try11-try14 have now finished.

With the unconstrained or try13=try14 costfcns, try14-opt2 (based on
try10-opt2) scores best.  try10-opt2 is next with unconstrained, but
try13-opt2 is next with try13=try14.

With the try12 costfcn, the order is try12, try7, try3, try13, try8.
The question now is whether the constraints satisfied by try12 should
outweigh the breaks and clashes.

With the try11 costfcn, the order is try14, try13, try11, try10, try7, try8
(try11 is optimized from alignments, particularly 2ah5A).

Rosetta best likes repacking try13-opt2, but try14-opt2 also beats the
old best (try4-opt2).

I did not much care for either try11 or try12---the helices have been
unwound or torn apart.

I think we should now submit
	try14-opt2
	try13-opt2
	something (try4-opt2.repack-nonPC?)
	align1 t04 2ah5A
	align5 t06 1swvA

Martin, please tell me what to use for the 3rd model.
Also tell me if there is some reason to prefer align2 (2gfhA) to
align1 (2ah5A).

Since the soft deadline is noon today, I'll submit the list above
now.

Tue Jul  4 08:41:39 PDT 2006 Kevin Karplus

So submitted.


Tue Jul  4 14:30:28 PDT 2006 Martin Madera

Oops, slept longer than I had intended. But full of energy now!

First, let me make sense of the runs.

Try11: I started that one. It's an -A- structure, a type that has proved
very difficult with this target [see the history for try4 (which
produced elab-A-), try6 and try9]. And it's the best -A- structure we
have so far; the bulk of it is well packed, it's just that front helix
that's screwed up. (Expected: that's where we have a large insertion
wrt type -A- structures.) I think playing around with -A- structures
is important, because remember they're our top BLAST hits:

PDB     BLAST E Classification & notes
2ah5A   4.1e-10 -A-
2go7A   1.3e-08 -A-
1o08A   1.2e-06 -A-
1rqlA   6.8e-05 elaborated -A- (extra small helix)
1swvA   1.2e-04 elaborated -A- (RMSD 0.7A with 1rqlA above)
2fi1A   2.6e-04 -A-
1fezA   2.6e-04 elaborated -A- (RMSD 0.6A with 1swvA above)
1rdfA   5.8e-04 elaborated -A-
1qq5A   0.004   -B-
2gfhA   0.006   -B-
1te2A   0.006   slightly modified -A-
1zrn    0.006   -B-
1jud    0.006   -B-
1aq6A   0.006   -B-
1qq7A   0.008   -B-

I'll try moving the helix with ProteinShop tomorrow.

Try12: I started that one. Elaborated -A-. It was a further attempt in
the try4, try7, try8, try12 series. I don't like what it did; I'll
have a proper look later, but for now try7/try8 are better.

Try13: Kevin's run, polishing try2..10-opt2.gromacs0.pdb.gz, no
constraints. As expected, resulted in an elab-A- structure. It looks
OK (about the same as try7/try8), but nothing to get too excited
about. However if it scores well, it's probably the best model in the
series.

Try14: Kevin's run, polishing try10-opt2.gromacs0.pdb.gz, no
constraints. Looks very similar to try10, but I'm sure the devil is in
the detail.


I agree with the list that Kevin submitted, those are the best
structures we have so far.

But I think try11 shows a lot of promise. Apart from the helix that's
sticking out (our insertion wrt the classic -A- structures), the
packing is the best so far. This isn't surprising, because -- to
repeat -- those are our top BLAST hits. Kevin, could you have a look
and tell me what you think?


Tue Jul  4 16:11:48 PDT 2006 Martin Madera

**OOOPS** managed to overwrite the .under and .costfcn for try7 by doing:

-bash-3.00$ cp try5.costfcn try7.costfcn
-bash-3.00$ cp try5.under try7.under
-bash-3.00$ emacs try7.under

I thought I was in the T0330 directory, but I was still in T0329.

Tue Jul  4 16:52:37 PDT 2006 Kevin Karplus

try7.under and try7.costfcn restored from my home computer.

I find that using emacs to do the copies (in the directory listing,
using 'C') helps avoid overwriting, since it asks if you intend to overwrite.
Even better, I go to the file try7.under in emacs, then insert file try5.under.
If try7.under already exists, I see it before I do the copy.
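
The same "refuse to overwrite" behaviour can be had from a script. A
minimal Python sketch, with a hypothetical function name; it relies on
the exclusive-creation mode "x" of the built-in open(), which fails if
the destination already exists:

```python
import shutil


def copy_no_clobber(src, dst):
    """Copy src to dst, raising FileExistsError instead of overwriting
    an existing dst."""
    # "xb" = exclusive creation, binary: open() fails if dst exists.
    with open(src, "rb") as fin, open(dst, "xb") as fout:
        shutil.copyfileobj(fin, fout)
```

With this, copy_no_clobber("try5.under", "try7.under") would have
stopped with an error rather than replacing the existing try7.under.
From the shell, cp -i gives a similar safeguard by prompting first.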


Tue Jul  4 16:56:55 PDT 2006 Kevin Karplus

I'm still not impressed with try11, but if you can clean up the
helices that are disordered, I'm certainly willing to include it.

I wouldn't put too much faith in the HMMs for this region---they are
recognizing mainly the other domain.


Tue Jul  4 17:35:18 PDT 2006 Martin Madera

Thanks for those try7 files! Re try11, do

restrict 1-64,90-239

I think that part looks very good. The problem is what to do about

select 65-89
color red
select *


Tue Jul  4 17:44:34 PDT 2006 Martin Madera

So, what went wrong with try12? The constraints were:

Constraint      G48.CA  S190.CA         4.7 5.7 30      5	8.58
Constraint      V52.CA  N133.CA         9.5 10.5 30.0   5	10.7
Constraint      T56.CA  V89.CA          0.0 6.0 8.09    5	6.0
Constraint      A60.CA  V84.CA          0.0 5.5 7.5     5	5.5

and I appended the actual distances (taken from
try12-opt2.constraints; checked rasmol, rasmol agrees).

The first two, G48-S190 and V52-N133, were an attempt to push the loop
further down. And they backfired, because it pushed the loop to the
left instead. Maybe "pushing away" is a bad idea, because there are
lots of directions in which this can be done.

The second two, T56-V89 and A60-V84, were an attempt to slide the
helix further down. And they worked (albeit at a cost of introducing
breaks in the chain around T56 and A60). Unfortunately this made the
cavity on the other side of the helix even worse. I think I need to
add in more constraints to try and close it.

Wed Jul  5 06:48:14 PDT 2006 Kevin Karplus

I've never had much luck with "keep-away" constraints.  As you say,
there are so many directions to move in.

Strong constraints are a bit dangerous, as they can override all the
terms that are trying to keep the model protein-like.

If you have two models that each have good parts, you can sometimes
make a chimera by superimposing them on a shared section, then doing
cut and paste with the editor.  Are there any models that have 65-89
that are compatible with try11 that you could paste in?  It looks to me
like the spacing may make it difficult to fill in the helices.
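
The cut-and-paste step can also be scripted. A rough Python sketch,
assuming both models have already been superimposed on the shared
section and use the same residue numbering; the function names are
hypothetical, and only the standard PDB ATOM column layout (residue
number in columns 23-26) is relied on. Atom serial numbers are not
renumbered and the geometry at the two junctions is not checked, so the
result would still need inspection and polishing.

```python
def residue_number(line):
    """Residue sequence number of a PDB ATOM record (columns 23-26)."""
    return int(line[22:26])


def splice_residues(base_lines, donor_lines, first, last):
    """Return the ATOM records of `base_lines`, except that residues
    first..last (inclusive) are taken from `donor_lines` instead.

    ASSUMPTION: the two models are already superimposed and share
    residue numbering; no fitting is done here.
    """
    base = [l for l in base_lines if l.startswith("ATOM")]
    donor = [l for l in donor_lines if l.startswith("ATOM")]
    before = [l for l in base if residue_number(l) < first]
    middle = [l for l in donor if first <= residue_number(l) <= last]
    after = [l for l in base if residue_number(l) > last]
    return before + middle + after
```

For the case discussed here, the call would be something like
splice_residues(try11_lines, other_lines, 65, 89), where both argument
names stand for lists of lines read from the two PDB files.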

Mon Aug 21 15:33:32 PDT 2006 Kevin Karplus

The outer domain is well modeled, but we had some trouble with the
inserted helical domain.  Our best model is try11-opt2.gromacs0,
followed closely by try11-opt2.  It looks like Martin's intuition that
try11 looked good was better than mine.

Our best submitted model was model2 (try13-opt2), which was somewhat worse
than try11-opt2.

This target was an outlier, because the align1 GDT (64.64%) was much
better than the model1 GDT (46.76%).  Note that the model 2 GDT of
60.36% was still not as good as align1, and even our best model
(try11-opt2.gromacs0) still only had a GDT of 64.64%.