Thu Jun  1 09:05:03 PDT 2006
T0309

Make started Thu Jun  1 09:21:24 PDT 2006
Running on lopez.cse.ucsc.edu

Thu Jun  1 09:30:06 PDT 2006 Kevin Karplus

BLAST finds modest hits
	2bdqA (93% id over 14 residues)
	1tf7A (31% over 77 residues---but there are only 76 residues!!)
	2gxfA, 2bdtA, 2afcA, 1xklA (100% id over 13 residues)
The short matches seem to be to the HIS tag---not really useful.

This looks like a fairly easy fold-recognition though, as the 1tf7A
hit is essentially full length.

The t06 multiple alignment finds nothing but the B.subtilis protein
itself (minus the 13-residue HIS tag: MAGDPLEHHHHHH), so the BLAST hit
was too distant to be found in NR.
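
The residue numbering used in the rest of these notes follows from simple
arithmetic, sketched here in Python.  The total length (76) and the tag
sequence come from the notes above; placing the tag at the C-terminus is
an assumption (it is consistent with the later references to G66, D67,
and H76), and the function name is only for illustration.

```python
# Sketch: number the residues of a C-terminal tag on a 76-residue target.
# Assumption: the 13-residue tag MAGDPLEHHHHHH occupies the last 13 positions.

TOTAL_LENGTH = 76        # target length, from the BLAST note above
TAG = "MAGDPLEHHHHHH"    # tag sequence, from the t06 alignment note


def tag_residue_labels(total_length: int, tag: str) -> list[str]:
    """Return labels such as 'M64' for each tag residue (1-based numbering)."""
    first = total_length - len(tag) + 1
    return [f"{aa}{first + i}" for i, aa in enumerate(tag)]


labels = tag_residue_labels(TOTAL_LENGTH, TAG)
native_end = TOTAL_LENGTH - len(TAG)   # last residue before the tag starts
print(native_end, labels[0], labels[-1])
```

Under that assumption the native protein ends at residue 63 and the tag
runs from M64 to H76.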

Thu Jun  1 21:40:03 PDT 2006 Kevin Karplus

I don't like the structure in try1-opt2, as it has used the HIS tag as
part of a sheet.  I think we should try predicting a subdomain,
M1-E63, which excludes the HIS tag.

Fri Jun  2 08:07:50 PDT 2006 Kevin Karplus

The M1-E63 model seems a bit foamy, but otherwise ok.  We may need to
tuck F42 into the interior.  Upping the dry packing terms (including
phobic_fit) may help pack this tighter.  We might also want to try
including a couple of the linker residues from the HIS tag, so that we
have better secondary structure prediction for the KGVE residues at
the end of the native protein.

Sat Jun 17 02:04:45 PDT 2006 Kevin Karplus

The M1-G66/try1-opt2 is quite similar to the M1-E63/try1-opt2, but the
G66 is buried inside, making it inaccessible for adding the HIS tag.
There is also a difference in the alignment of the sheet:

# from M1-G66/decoys/try1-opt2.sheets
SheetConstraint 	F14 	F15	T50 	I49	hbond 	F14	1

# from M1-E63/decoys/try1-opt2.sheets
SheetConstraint 	N10 	F15	K52 	V47	hbond 	K12	1
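
One way to quantify the difference between these two constraints is to
compare strand registers, as in the Python sketch below.  It assumes that
the first four fields are strand-1 start, strand-1 end, strand-2 start,
and strand-2 end, with start paired to start; that reading of the
SheetConstraint format is a guess and not taken from the undertaker
documentation.  The helper names are hypothetical.

```python
# Sketch: compare strand registers of two SheetConstraint lines.
# Assumption: fields 1-4 are strand-1 start, strand-1 end, strand-2 start,
# strand-2 end; the real undertaker semantics may differ.
import re


def resnum(label: str) -> int:
    """Extract the residue number from a label such as 'F14'."""
    return int(re.sub(r"^[A-Za-z]+", "", label))


def register(line: str) -> tuple[int, int]:
    """Return (start1 + start2, end1 + end2) for one SheetConstraint line."""
    fields = line.split()
    s1, e1, s2, e2 = (resnum(f) for f in fields[1:5])
    return s1 + s2, e1 + e2


g66 = register("SheetConstraint F14 F15 T50 I49 hbond F14 1")
e63 = register("SheetConstraint N10 F15 K52 V47 hbond K12 1")
shift = g66[0] - e63[0]
print(g66, e63, shift)
```

For an antiparallel pair the sum of the paired residue numbers is
constant along the ladder, so the two sums (64 for M1-G66, 62 for M1-E63)
would correspond to a two-residue register shift: F15 pairs with I49 in
one model and with V47 in the other.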


I tried making a chimera of M1-G66 try1-opt2 and undertaker-align
model2 (which has the HIS tag from D67 on in a reasonable place).
This is chimera1 with the crossover between G66 and D67.
There is a bad break, but we may be able to fix that.

I also made chimera2 from M1-G66 try1-opt2 and
undertaker-align crossing over between V62 and E63.

I also made chimera3 from M1-E63 try1-opt2 and
undertaker-align crossing over between V62 and E63.

try2.costfcn and try3.costfcn use the sheet constraints from M1-G66
try4.costfcn uses the sheet constraints from M1-E63

Try2 will optimize chimera1	(on lopez)
Try3 will optimize chimera2	(on shaw)
Try4 will optimize chimera3	(on shaw)

Sat Jun 17 02:43:47 PDT 2006 Kevin Karplus

It looks like the gaps will be closed fairly quickly in all three
runs, and the final models will be at least as good as the try1-opt2
model that buried the HIS tag in the sheet.

Sat Jun 17 08:12:57 PDT 2006 Kevin Karplus

Although try3-opt2 scores better than try4-opt2, I like the sheets of
try4-opt2 better.  We might be able to improve the packing of try4 by
adding Hbond E63.N P58.O, which seems to be trying to form.

The break before E63 did not close in try4-opt2, because V62 is buried
too deep.  Perhaps another chimera---one with the guts of try4 but the
N- and C-termini from try3-opt2?   Hmm, I'm not sure whether I like the
N-terminus from try3 or try4 better---try3 buries M1, but try4
solvates S3.OG.  The burial of M1 results in some pretty bad clashes,
so maybe I should stick with try4's N-terminus.  Actually, try2's
N-terminus is also a reasonable option.

Take M1-V6 from try2, H7-K60 from try4, and G61-H76 from try3.  This
chimera (chimera4) does not have the possible E63.N P58.O Hbond.  In
fact, E63 looks too buried.  Perhaps optimization will free that up a
bit.  
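
As a sanity check on the chimera4 recipe, the Python sketch below
verifies that the three segments tile residues 1 to 76 exactly once and
reports the crossover points.  The segment boundaries are the ones given
above; the helper function is hypothetical and does not touch any
coordinates.

```python
# Sketch: check that chimera segments tile residues 1..76 exactly once.
# The segment table is from the note above; the helper is hypothetical.

SEGMENTS = [           # (first residue, last residue, source model)
    (1, 6, "try2"),    # M1-V6
    (7, 60, "try4"),   # H7-K60
    (61, 76, "try3"),  # G61-H76
]


def crossovers(segments, total_length):
    """Validate contiguous coverage and return the crossover points."""
    ordered = sorted(segments)
    if ordered[0][0] != 1 or ordered[-1][1] != total_length:
        raise ValueError("segments do not span the full chain")
    points = []
    for (_, end, src_a), (start, _, src_b) in zip(ordered, ordered[1:]):
        if start != end + 1:
            raise ValueError(f"gap or overlap between {src_a} and {src_b}")
        points.append((end, start))
    return points


print(crossovers(SEGMENTS, 76))
```

The two crossovers (V6/H7 and K60/G61) are where breaks would be expected
before optimization.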

Sat Jun 17 08:55:13 PDT 2006 Kevin Karplus

optimizing chimera3 and chimera4 as try5 on shaw

Sat Jun 17 09:21:15 PDT 2006 Kevin Karplus

Interestingly, chimera3 seems to have been the model picked by try5 to improve.
I'll start a try6 with the same costfcn as try5, but with just
chimera4 as a starting point.

Sat Jun 17 09:23:48 PDT 2006 Kevin Karplus

try6 started on lopez

Sat Jun 17 10:18:26 PDT 2006 Kevin Karplus

history to keep track of where models came from:

M1-G66 -> chimera1, chimera2 -> try2, try3
M1-E63 -> chimera3 -> try4, try5
		try4 -> chimera4 -> try6

Sat Jun 17 10:30:21 PDT 2006 Kevin Karplus

try6-opt2 and try5-opt2 are the two top-scoring models with try6
(=try5=try4) costfcn.  They are also the best with the unconstrained
costfcn. 

Both undertaker and rosetta prefer try6 to try5.

I'm reasonably happy with both.  I'll do a polishing run with breaks
and clashes turned up, starting from all our models.

Sat Jun 17 10:44:49 PDT 2006 Kevin Karplus

try7 started on lopez

Sat Jun 17 13:03:02 PDT 2006 Kevin Karplus

try7-opt2 is best scoring with try7 costfcn, and rosetta likes
repacking it best of any of the models.  It seems to be based on try6,
though there may have been some crossover.


I think we've reached the point of diminishing returns here.  I'll submit
ReadConformPDB T0309.try7-opt2.pdb
ReadConformPDB T0309.try5-opt2.pdb
ReadConformPDB T0309.try3-opt2.pdb
ReadConformPDB T0309.try4-opt2.pdb
ReadConformPDB T0309.try2-opt2.pdb

History to keep track of where models came from:

M1-G66 -> chimera1 -> try2
M1-G66 -> chimera2 -> try3
M1-E63 -> chimera3 -> try4, try5
		try4 -> chimera4 -> try6 -> try7
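
The same history can be kept as a small parent table, which makes it easy
to trace any model back to its subdomain prediction.  This Python sketch
copies the links from the history above; it is a single-parent
simplification (chimera4 also took its termini from try2 and try3), and
the function name is only for illustration.

```python
# Sketch: trace a model back to its starting subdomain prediction.
# Links are copied from the history above.  Simplification: each model
# gets one parent, although chimera4 also used termini from try2 and try3.

PARENT = {
    "chimera1": "M1-G66",
    "chimera2": "M1-G66",
    "chimera3": "M1-E63",
    "try2": "chimera1",
    "try3": "chimera2",
    "try4": "chimera3",
    "try5": "chimera3",
    "chimera4": "try4",
    "try6": "chimera4",
    "try7": "try6",
}


def lineage(model: str) -> list[str]:
    """Return the chain from a model back to its root, newest first."""
    chain = [model]
    while model in PARENT:
        model = PARENT[model]
        chain.append(model)
    return chain


print(" <- ".join(lineage("try7")))
```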

Mon Jul  3 12:59:49 PDT 2006 Kevin Karplus

It may be worthwhile to do a polishing run with breaks and clashes
turned up starting only from the gromacs models (to escape from local minima).

try8 started on farm cluster.

Mon Jul  3 16:47:26 PDT 2006 Kevin Karplus

try8 greatly reduced breaks and clashes, but is still pretty foamy.
Rosetta likes it best of any of the backbones for repacking.


Mon Jul  3 17:01:47 PDT 2006 Kevin Karplus

Trying one more polishing run from all the gromacs optimized models,
in the hope of beating try8-opt2.  After that I should probably do a
polishing run from all models.

(try9 started on farm cluster)

Tue Jul  4 15:06:48 PDT 2006 Kevin Karplus

For some reason, gromacs is not running on the farm cluster, so I
reran it on cheep.

I also tried making try8-opt2.gromacs0.repack-nonPC (and try9...).

Currently, the best score with try9.costfcn is T0309.try9-opt1 (not
opt2, as there are some larger breaks in opt2, apparently).

Rosetta likes best 
	decoys/T0309.try9-opt1.gromacs0.repack-nonPC.pdb.gz
	decoys/T0309.try8-opt2.gromacs0.repack-nonPC.pdb.gz
	decoys/T0309.try9-opt2.gromacs0.repack-nonPC.pdb.gz

Tue Jul  4 15:27:34 PDT 2006 Kevin Karplus

I'll do one more polishing run (try10) on the farm cluster, but I
think we've reached the point of diminishing returns---we're unlikely
to make the model better with further refinement.

Tue Jul  4 17:25:45 PDT 2006 Kevin Karplus

I will submit with the following comments:

    T0309 is an ORFan---we found no similar sequences in NR, even with our
    most-sensitive iterated searches.  This makes all of our prediction
    methods (neural nets, HMMs, ...) much less accurate, as we have no
    signal from evolutionary sampling of the fold.

    Our first model had a serious flaw---it included the HIS tag in a sheet.
    Under the assumption that the structure is formed without the HIS tag,
    we made two subdomain predictions M1-E63 and M1-G66.  HIS tags were
    pasted onto the automatically generated subdomain models, and the
    resulting chimeras optimized.  Closing the gaps was a bit hard, as the
    subdomain models had not necessarily left the C-terminus on the surface.
    We also tried taking the sheet from one of the optimizations, and 
    adding the HIS tag from one of the others.

    history to keep track of where models came from:

    M1-G66 -> chimera1 -> try2
    M1-G66 -> chimera2 -> try3
    M1-E63 -> chimera3 -> try4, try5
		    try4 -> chimera4 -> try6 -> try7

	    try5-opt.gromacs0 -> try8-opt2.gromacs0 -> try9-opt1.gromacs0.repack-nonPC -> try10

    None of the HIS tags are very convincing, but HIS tags are often
    disordered, so there is not much point in trying to optimize the
    prediction of their structure.

    Model 1 is try10-opt2, which twice had pieces from other optimizations
    stuck onto the M1-E63 base model.  It scores best with our cost
    functions.

    Model 2 is try7-opt2, optimized from chimera4.

    Model 3 is try3-opt2, which uses a different alignment of the strands.

    Model 4 is try4-opt2, yet another optimization from chimera3. It is
	    the base model which had N- and C-termini replaced to make
	    chimera4, which was optimized to form our best-scoring model.

    Model 5 is try2-opt2, which is an optimization of chimera1, based on
	    the same underlying model as try3-opt2 (model 3), but with
	    a different HIS tag attached.

Wed Mar 21 20:56:24 PDT 2007 Kevin Karplus

Our best model is align2, an alignment to 2fvtA, which was better than
any of the server models in the "real_cost" function, though the large
number of missing atoms makes me suspect that this is an RMSD
artefact, and that it actually did rather poorly.

Our best complete model was try6-opt1.
The best we submitted was model4=try4-opt2, which was only slightly worse.