Wed Jun 14 09:11:07 PDT 2006
T0330
Make started Wed Jun 14 09:11:50 PDT 2006
Running on lopez.cse.ucsc.edu

Wed Jun 14 09:13:20 PDT 2006 Kevin Karplus

BLAST gets a moderate hit on 2ah5A (23% over 211 residues, 5.5e-04)


Wed Jun 14 10:39:36 PDT 2006 Kevin Karplus
2ah5A is getting excellent scores with the HMMs also.

Wed Jun 14 15:07:14 PDT 2006 Kevin Karplus

2ah5A is our top hit, but many c.108.1.* templates score well.
There seems to be general agreement on the outer domain, but the
insertion L15-L94 seems to vary more.  This may just be hinging motion
between the two domains, or it may be higher variability in the
inserted region.


Wed Jun 28 18:32:52 PDT 2006 Martin Madera

This target is very similar to T0324, which I'm also doing. See the
T0324 README first, because the problem seems to be the same.

In a nutshell, try1-opt2 is again based on 1x42, which is wrong. The
top BLAST hits have a conserved crevice which undertaker doesn't like,
so it uses a more distant template (1x42) which closes it somewhat,
and undertaker further modifies it to fill it up as much as possible.

All the machines are busy at the moment, so I'll wait with try2 till
later on tonight.


Wed Jun 28 21:25:46 PDT 2006 Martin Madera

Hmmm, but looking at the HMM alignments using:

less */T0330-*-t2k-local-adpstyle1.a2m

the following look the best:

>1aq6A
-MIKAVVFDAYGTLFDVQSVADATERAYpgRGEYITQVWRQKQLEYSWLR
ALMGRYADFWGVTREALAYTLGTLGLEPDESFLADMAQAYNRLTPYPDAA
QCLAELAP---LKRAILSNGAPDMLQALVANAGLTDSFDAVISVDAKRVF

>1jud
mdy--IKGIAFDLYGTLFDVHSVVGRCDEAFpgRGREISALWRQKQLEYT
WLRSLMNRYVNFQQATEDALRFTCRHLGLDLDARTRSTLCDAYLRLAPFS
EVPDSLRELKR-RGLKLAILSNGSPQSIDAVVSHAGLRDGFDHLLSVDPV

>1lvhA
-MFKAVLFDLDGVITDTAEYHFRAWKALAEEIGINGVDRQFNEQLKGVSR
EDSLQKILDLADKKVSAEEFKELAKRKNDNYVKMIQDVSPADVYPGILQL
LKDLRS-NKIKIALASAS--KNGPFLLERMNLTGYFDAIADPAEVAASKP

>1o08A
-MFKAVLFDLDGVITDTAEYHFRAWKALAEEIGINGVDRQFNEQLKGVSR
EDSLQKILDLADKKVSAEEFKELAKRKNDNYVKMIQDVSPADVYPGILQL
LKDLRS-NKIKIALASAS--KNGPFLLERMNLTGYFDAIADPAEVAASKP

>1qq5A
-MIKAVVFDAYGTLFDVQSVADATERAYpgRGEYITQVWRQKQLEYSWLR
ALMGRYADFWSVTREALAYTLGTLGLEPDESFLADMAQAYNRLTPYPDAA
QCLAELAP---LKRAILSNGAPDMLQALVANAGLTDSFDAVISVDAKRVF

>1qq7A
-MIKAVVFDAYGTLFDVQSVADATERAYpgRGEYITQVWRQKQLEYSWLR
ALMGRYADFWSVTREALAYTLGTLGLEPDESFLADMAQAYNRLTPYPDAA
QCLAELAP---LKRAILSNGAPDMLQALVANAGLTDSFDAVISVDAKRVF

>1zrn
mdy--IKGIAFDLYGTLFDVHSVVGRCDEAFpgRGREISALWRQKQLEYT
WLRSLMNRYVNFQQATEDALRFTCRHLGLDLDARTRSTLCDAYLRLAPFS
EVPDSLRELKR-RGLKLAILSNGSPQSIDAVVSHAGLRDGFDHLLSVDPV

... the insertion is 16-92. This is a different set from what I used
for T0324, and moreover doesn't agree with the BLAST hits (where 2ah5A
and 2fdrA are the top two).

Interestingly, 1lvhA and 1o08A -- the two best matches, no insertions
or deletions in our region -- have one type of structure, let's call
it -A-, the rest have a different structure, let's call it -B-.
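
The "no insertions or deletions" check can be done by counting column
types in each A2M record. A minimal sketch, assuming the usual SAM A2M
convention (uppercase = match column, lowercase = insertion, '-' =
deletion, '.' = padding); the function name is made up for illustration:

```python
def count_a2m_columns(aligned):
    """Count match, insert, and delete characters in one A2M-aligned sequence.

    Assumes the SAM A2M convention: uppercase letters are match columns,
    lowercase letters are insertions, '-' is a deletion, '.' is padding.
    """
    counts = {"match": 0, "insert": 0, "delete": 0}
    for ch in aligned:
        if ch == "-":
            counts["delete"] += 1
        elif ch.isupper():
            counts["match"] += 1
        elif ch.islower():
            counts["insert"] += 1
        # '.' padding and whitespace are ignored
    return counts


# First line of the 1jud record listed above.
print(count_a2m_columns("mdy--IKGIAFDLYGTLFDVHSVVGRCDEAFpgRGREISALWRQKQLEYT"))
```

To restrict the count to the inserted region, the record would have to be
sliced by match-column index first; that step is not shown here.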

Importantly, -A- is basically the same as the structure of 2ah5A and
2fdrA (the top BLAST hits) -- there's an insertion of 7 residues wrt
2ah5A which extends two adjacent helices and moves the loop between
them, but the rest of the structure is virtually identical. (The
superpositions are in T0324/align.)

All other structures have that short 'pg' insertion, and -B- is
basically the same as 1x42 of T0324 fame. Most of them are dimers, and
those that aren't look like they should be.


Hmm... but this is irrelevant, really, as I want to edit the .under
files, which contain the following structures one or more times:

1jud	19	-B-
1l7mA	7	strange structure, another insertion in the main domain
1rdfA	9	strange structure
1rqlA	77	-A-
1swvA	96	-A-
1te2A	443	-A-
1x42A	655	-B-
1zrn	6	-B-
2ah5A	311	-A-
2fdrA	232	-A-
2fi1A	8	-A-
2gfhA	524	-B-
2go7A	88	-A-

... the numbers indicate positions in the T06, T04 and T2k
alignments. Basically -A- predominates, but lengthwise -B- is also
plausible and makes a reasonable showing. Sigh.

I think I'll do two runs, one based on the -A- structures, one based
on the -B- structures, and we'll see how they work out.

Try2 = -A- = 1rqlA|1swvA|1te2A|2ah5A|2fdrA|2fi1A|2go7A
Try3 = -B- = 1jud|1x42A|1zrn|2gfhA

Sun Jul  2 12:11:52 PDT 2006 Kevin Karplus

For some reason, try3 did not do the repacking (perhaps try2 deleted
the repack.res file that try3 needed), so I remade try3, which just
made the missing try3-opt2.repack-nonPC model and rescored.

I also scored the existing models with unconstrained.costfcn.
It is looking like try2 is being more successful than try3 in finding models.

Sun Jul  2 12:16:53 PDT 2006 Kevin Karplus

Part of the problem may be that try3 was given very few alignments to try.
It might be a good idea to put all the interesting chains in
MANUAL_TOP_HITS in the Makefile (before the include) then do
	make extra_alignments
	make read_alignments
and use
	InfilePrefix 1xxxX/
	include read-alignments-scwrl.under
in the try*.under file, one such pair per chain.


Sun Jul  2 20:00:04 PDT 2006 Martin Madera

I agree that try3 is a disaster. Try2 looks ok; it's roughly the
structure I want, but compared to what T0324 looked like, the
insertion isn't very well packed.

I will run the insertion by itself. I'm not sure it's going to work
(it's short, so the iterative procedure may not work), but it's worth
a try. If the HMM works, it would stop undertaker from trying to pull
the insertion towards the main domain.

So: 

1) edit Makefile: first 15, last 93 => length 79

2) make subdomain ... worked! L15-L93

3) (make -k >& subdomain.log; gzip -9f subdomain.log) & 

Running on orcas. I can see a blastpgp process, so it's doing something sensible.
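
The length in step 1 follows from counting both end residues
(93 - 15 + 1 = 79). A small sketch of that arithmetic, assuming 1-based
inclusive residue numbering; the helper name and the placeholder
sequence are hypothetical, not part of the Makefile:

```python
def subdomain(sequence, first, last):
    """Return residues first..last (1-based, inclusive) of a sequence."""
    if not (1 <= first <= last <= len(sequence)):
        raise ValueError("residue range outside the sequence")
    return sequence[first - 1:last]


# Placeholder sequence; the real target sequence is not reproduced here.
placeholder = "A" * 200
piece = subdomain(placeholder, 15, 93)
print(len(piece))  # 79
```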


Sun Jul  2 21:51:05 PDT 2006 Martin Madera

Problem #1: it got stuck asking me for a password to apache. So I typed it in, but then

Problem #2:

WARNING: This version of the Acrobat Distiller may not be fully
         compatible with previous versions of Acrobat Exchange
         and Acrobat Reader due to new functionality based on
         recent PDF language additions.  To ensure compatibility
         with previous versions of Acrobat Exchange and Acrobat
         Reader, the DEFAULT for compatibility can be set inside
         your personal preference file.  Setting this compatibility
         switch, however, will result in the disabling of new
         features of the Acrobat Distiller.  The compatibility
         switch may easily be overridden at run-time through the
         specification of -compatlevel and in no way represents an
         actual permanent loss of features for Acrobat Distiller.

   NOTE: Running the Acrobat Distiller with the -noprefs option
         will effectively disable the preference chosen below.

   Enter Acrobat Distiller personal preferences file modification [1, 2, 3]
      [1] use 3.0 new features
      [2] use 2.1 compatibility
      [3] leave compatibility undefined
   Invalid response. Please reenter response [1|2|3]
   Invalid response. Please reenter response [1|2|3]
   [the same prompt repeats several more times]
   Invalid response. Please reenter...

OK. So need to set up ssh... done. And figure out what's going on with
Acrobat Distiller. Ssh'd to apache, ran

distill T0330.t06.w0.5-logo.eps

chose [3]... and it worked. Then ran distill again, and this time it
didn't ask. GOOD. Problem solved.


Mon Jul  3 01:24:38 PDT 2006 Martin Madera

The subdomain run finished. There were errors, so I ran it again, and
there were still errors. No try1 PDB structures got generated, which is a pity.

However, looking at the HMM hits, it would have been a waste of time
anyway. The iterative procedures didn't really work, and to my
surprise the template libraries didn't pick it up either! Well, was worth a try.

--

Try4: following Kevin's suggestion on MANUAL_TOP_HITS, added

MANUAL_TOP_HITS:= 1rqlA 1swvA 1te2A 2ah5A 2fdrA 2fi1A 2go7A 1jud 1x42A 1zrn 2gfhA

to the Makefile, ran 

make extra_alignments
make read_alignments

and started try4 (copied from try3.under) with the following extra
lines in try4.under:

InfilePrefix 1jud/
include read-alignments-scwrl.under
InfilePrefix 1x42A/
include read-alignments-scwrl.under
InfilePrefix 1zrn/
include read-alignments-scwrl.under
InfilePrefix 2gfhA/
include read-alignments-scwrl.under

(just after the lines taken from all-align.a2m).
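
Since the InfilePrefix/include pair is repeated once per template, the
block can be generated rather than typed. A minimal sketch; the function
name is hypothetical, and the only assumption is that each template
directory is named after the chain ID as in the lines above:

```python
def alignment_includes(templates, include_file="read-alignments-scwrl.under"):
    """Build one InfilePrefix/include pair per template chain."""
    lines = []
    for template in templates:
        lines.append(f"InfilePrefix {template}/")
        lines.append(f"include {include_file}")
    return "\n".join(lines)


# The four -B- templates used for try4.
print(alignment_includes(["1jud", "1x42A", "1zrn", "2gfhA"]))
```

The printed text can be pasted into the try*.under file at the point
described above.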

Also changed the scoring function to:

    dry5 30 dry6.5 30 dry8 20 dry12 5

(instead of:

    dry5 15 dry6.5 20 dry8 15 dry12 5 )

to make it slightly more compact. Running on orcas.


Tue Jul  4 00:19:58 PDT 2006 Martin Madera

Try4 has done the trick for the -B- domains. I'll do two polishing
runs based on try2 and try4 and call it a day.

Try5: polishing try2-opt2. Break, dry5 dry6.5 x2. Running on lopez.

Try6: polishing try4-opt2. Break, dry5 dry6.5 as for try5. Running on pyro.


Tue Jul  4 08:49:40 PDT 2006 Kevin Karplus

To do a real "polishing" run, you need to turn up clashes as well as
breaks, and turn constraints way down or off.  The remaining clashes are
still a bit too large.

Please try to have something ready by 8pm, as I'd like to get some
sleep tonight.


Tue Jul  4 15:26:10 PDT 2006 Martin Madera

Maybe "polishing" was the wrong word for try5/try6, I meant "improve
upon". I wanted to get a more compact structure first, before doing
the final polishing runs (along the lines of what Kevin did yesterday
for T0329).

Try5/try6 aren't much better than try2/try4. Hmmm. Out of T0324, T0329
and T0330, this (T0330) is the most difficult one:

target	best BLAST E-value
T0329	4.1e-10
T0324	1.8e-08
T0330	5.5e-04

so it's not surprising that it looks the worst so far. I don't see an
obvious way of improving try6. For try5 I have an idea: try and push
G50 closer towards R83 in an attempt to close the cavity.

So try7: start from the gromacs versions of try5 and try2, with

Constraint      G50.CA  R83.CA         0.0 7.0 9.0    5

DryXX and break back to standard, constraints up x3 to 30. Running on
peep.

Now two real polishing runs (based on Kevin's try13/try14 from T0329),
so that we have something decent to submit for the soft deadline:

try8: polishing try2,5-gromacs ... running on squawk
try9: polishing try4,6-gromacs ... running on pyro.


Tue Jul  4 16:54:01 PDT 2006 Martin Madera

Noticed errors in try7.log: had 

ReadConformPDB TT0330

... note the double T! Fixed, restarted.


Tue Jul  4 17:15:49 PDT 2006 Martin Madera

Polishing is all very well, but the fact is that so far the structures
for the inserted domains aren't very good. Try7 is an attempt to do
something about this, but we need more. Any ideas? 

I am tempted to do something with constraints, to keep an open
crevice, like what I did for try12 in T0329. Except that was a
disaster, so I really need to go back and understand what went wrong.


Tue Jul  4 18:54:27 PDT 2006 Martin Madera

Try7 and try8 have finished. 

I can't really see any difference between try8 and try5, but I guess
the devil's in the detail.

Try7: the constraint was:

Constraint      G50.CA  R83.CA         0.0 7.0 9.0    5

and the actual distance is... 7.0A! And, wow, look at the packing
compared to try5!!! My only complaint is that I should have pulled it
in even closer.
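
A distance like this can be checked directly from a model's PDB file. The
sketch below reads CA coordinates from the fixed-column ATOM records and
compares the distance with the constraint bounds. It assumes the three
numbers after the atom names are minimum, desired, and maximum distance
in Angstroms (followed by a weight); that reading of the Constraint line
is an assumption. All function names are hypothetical, and the
coordinates in the demo are invented so that the distance comes out at
7.0:

```python
import math


def ca_coords(pdb_lines):
    """Map residue number -> (x, y, z) of its CA atom, using PDB fixed columns."""
    coords = {}
    for line in pdb_lines:
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            # Keep the first CA seen for each residue number.
            coords.setdefault(int(line[22:26]), xyz)
    return coords


def ca_distance(pdb_lines, res_a, res_b):
    """Distance between the CA atoms of two residues."""
    coords = ca_coords(pdb_lines)
    return math.dist(coords[res_a], coords[res_b])


def within_bounds(distance, minimum, maximum):
    """True if the distance lies inside the assumed constraint bounds."""
    return minimum <= distance <= maximum


def demo_ca_line(resname, resseq, x, y, z):
    """Build one CA ATOM record with invented coordinates (demo only)."""
    return f"ATOM  {1:5d}  CA  {resname:>3s} A{resseq:4d}    {x:8.3f}{y:8.3f}{z:8.3f}"


demo = [
    demo_ca_line("GLY", 50, 0.0, 0.0, 0.0),
    demo_ca_line("ARG", 83, 7.0, 0.0, 0.0),
]
d = ca_distance(demo, 50, 83)
print(f"{d:.1f}", within_bounds(d, 0.0, 9.0))
```

For a real model, pass the lines of its PDB file in place of the demo
records.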

According to the unconstrained cost function, try7 beats try5 by quite
a margin. I'll start a polishing run based on try7 (the gromacs
version), but it's likely to take a few hours. I will also do a repeat
of try7, trying to pull it in even closer.

I've updated best-models based on what we've got so far.


Tue Jul  4 20:06:49 PDT 2006 Martin Madera

Try9 finished. According to the unconstrained cost function it's a
minor improvement on try6, but Rosetta seems to like it. Updated best-models.


Tue Jul  4 20:21:28 PDT 2006 Martin Madera

While looking at best-models, in particular the model from alignments,
I got the following idea for a constraint:

Constraint      A51.CA  D72.CA         0.0 6.3 8.3    5

Try10: like try7, but with the above constraint rather than the try7
one, and reading in try2,5,7-opt2.gromacs. Running on peep.

Try11: polishing try7-opt2.gromacs. Running on lopez.

Try12: polishing try7-opt2. Running on squawk.

Tue Jul  4 23:04:52 PDT 2006 Kevin Karplus

try10 and try11 have finished.  try12 is still running---perhaps
squawk is a slightly slower machine?

With unconstrained costfcn, best models are
	try11-opt2, try12-opt1, try7-opt2, try10-opt2

(I think that try12-opt2 will be best when try12 finishes.)

try11-opt2 is best with the try11 costfcn (which has no constraints),
	followed by try8-opt2, try10-opt2, try12-opt1, try7-opt2

try10.costfcn (with hefty constraints) orders them
	try10-opt2, try8-opt2, try5-opt2, try2-opt2

try11, try12, and try7 are all *very* similar, but try10 is a bit
different---it looks more like try5.


Tue Jul  4 23:37:50 PDT 2006 Kevin Karplus

preliminary submission with comment:

    Model 1 is try11-opt2, our best-scoring model at the moment.

    Model 2 is try12-opt1, a very similar model.

    Model 3 is try10-opt2, optimized from try5-opt2.gromacs0, a slightly
	    different helical domain.

    Model 4 is T0330.try9-opt2.gromacs0.repack-nonPC, an alternative
	    helical domain, and the model that rosetta likes best after repacking.

    Model 5 is sidechain replacement by SCWRL on a backbone from alignment
	    to 2ah5A.


Fri Jul  7 14:37:59 PDT 2006 Martin Madera

A quick summary of the models to remind myself where things stand on this target.

try1:  -B- structure; I thought it was a failure at the time but it's actually OK
try3:  attempt at -B-, blew up
try4:  -B- structure
try6:  re-running try4 with higher breaks & dry
try9:  polishing try4,6 (but only using try6)

try2:  -A- structure
try5:  re-running try2 with higher breaks & dry
try8:  polishing try2,5

try7:  from the Gromacs versions of try2,5, a constraint that closed the cavity
try10: a different constraint, trying to improve Gromacs versions of try2,5,7
try11: polishing Gromacs version of try7
try12: polishing try7

Try10 looks worse than try7, and try7 (and later models based on it)
look very good.

I don't think there's much else we can do on this target, so I'll call
it a day. I agree with Kevin's choice of models for the preliminary
submission.


Fri Jul  7 16:58:21 PDT 2006 Martin Madera

A good idea gleaned from T0303: try polishing the server models.

Looking at decoys/score-all+servers.unconstrained.pretty, the top models are:

SAM_T06_server_TS1
Pmodeller6_TS1
ROBETTA_TS1
  :
BayesHH_TS1-scwrl
HHpred3_TS1-scwrl
Zhang-Server_TS1

3Dpro didn't do so well. So I'll base it on Pmodeller6 & Robetta,
copying T0303/try3.under. This will be try13. Running on peep.


Fri Jul  7 20:25:19 PDT 2006 Martin Madera

Try13 finished. According to the unconstrained cost function it
doesn't score very well, so I don't think we should include it in the
submission.

Fri Jul  7 23:43:30 PDT 2006 Kevin Karplus

I see nothing in here to indicate the desire for a new final submission.
I'll let the preliminary submission ride, unless Martin sends me email.