Tue Jul  8 13:46:29 PDT 2008
TR429

The Make.main file was modified to fetch the refinement model from 
	http://predictioncenter.gc.ucdavis.edu/casp8/target.cgi?target=TR429&view=template

I trimmed the a2m file and am starting a new prediction for this subdomain.

Make started Tue Jul  8 14:01:17 PDT 2008
Running on moai08.kilokluster.ucsc.edu

Tue Jul  8 14:03:33 PDT 2008 Kevin Karplus

The CASP organizers also recommend doing subdomains:
	M1-R100	(actually E22-R100)
	A101-P176

Tue Jul  8 14:55:17 PDT 2008 Kevin Karplus

I looked for the model in the server models and our T0429
models---none came very close, with GDTs of at most 39% to TR429.
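
As a reminder of what the GDT percentages here measure, below is a
minimal Python sketch of a GDT_TS-like score.  It assumes the per-residue
CA-CA distances have already been computed under one fixed superposition;
the real GDT_TS searches over many superpositions for each cutoff, so this
simplified version can only underestimate it.  The function name gdt_ts
and its arguments are illustrative, not part of any tool used in this
directory.

```python
def gdt_ts(distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """GDT_TS-like score (percent) from per-residue CA-CA distances in Angstroms.

    Simplification (assumption): the distances come from a single, fixed
    superposition of model onto target.
    """
    if not distances:
        raise ValueError("need at least one residue distance")
    n = len(distances)
    # Fraction of residues within each distance cutoff.
    fractions = [sum(d <= c for d in distances) / n for c in cutoffs]
    # GDT_TS is the mean of the four fractions, reported as a percentage.
    return 100.0 * sum(fractions) / len(cutoffs)


# Four residues at 0.5, 1.5, 3.0 and 9.0 Angstroms from their target positions.
print(gdt_ts([0.5, 1.5, 3.0, 9.0]))
```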

The closest models to TR429 are
	
by GDT:	RAPTOR_TS3	GS-KudlatyPred_TS5	Phragment_TS5	try1-opt3.gromacs0.pdb	COMA-M_TS1
by real_cost: Phragment_TS5	RAPTOR_TS3	try1-opt3.gromacs0.repack-nonPC	Phyre_de_novo_TS1

It looks like we did not do so well on this target (though we did
submit a try1 model as our model 5, which may not be too terrible).

I suspect that TR429 does well on the two subdomains, but still
doesn't pack them quite right against each other.

What the CASP organizers say is

	REFINEMENT TARGET TR429 (one of the best submitted
	models). MODEL GDT_TS=47. This is a two-domain protein: D1:
	1-100, D2:101-178. You can refine each domain separately (if
	desired) and then submit the refined domains in one file (as a
	model for the whole sequence). Refinement of each domain will
	be evaluated separately. Residues 1-21, 26, 55-72, 155,
	177-178 are missing in the experimental structure. Residues
	1-21 and 177-178 are cut out from the model.  
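
To see how many residues can actually be evaluated in each domain, here
is a small Python sketch that subtracts the missing residues quoted above
from the two domain ranges.  The residue ranges are taken from the
organizers' text; the names expand, DOMAINS, MISSING, and
observed_residues are just illustrative.

```python
def expand(ranges):
    """Expand inclusive (start, end) residue ranges into a set of residue numbers."""
    return {i for start, end in ranges for i in range(start, end + 1)}


# Domain boundaries and missing residues, as quoted by the CASP organizers.
DOMAINS = {"D1": (1, 100), "D2": (101, 178)}
MISSING = expand([(1, 21), (26, 26), (55, 72), (155, 155), (177, 178)])


def observed_residues(domain):
    """Residues of the domain that are present in the experimental structure."""
    start, end = DOMAINS[domain]
    return sorted(expand([(start, end)]) - MISSING)


for name in DOMAINS:
    print(name, len(observed_residues(name)), "observed residues")
```

By this count, D1 has 60 observed residues and D2 has 75.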

Tue Jul  8 19:02:45 PDT 2008 Kevin Karplus

TR429 scores poorly on almost all the measures, including clashes and breaks.
I'll probably have to extract constraints from it, though it doesn't
seem to have many sheet constraints (unlike the try1 models).

Tue Jul  8 19:06:27 PDT 2008 Kevin Karplus

I think that TR429 needs sheets, but doesn't quite form them.  I
should look to see what sheet constraints come closest to what it has,
and try enforcing them.

Tue Jul  8 21:04:08 PDT 2008 Kevin Karplus

I've started separate subdomain predictions.

The N-terminal domain for the initial TR429 model is compatible in its
sheets with the try1 model, so I'll probably end up using the sheet
constraints from try1-opt3 to try to clean up the original model for
the N-terminal part.

The C-terminal domain is more compatible with the alignment to 2bkdN (align2.sheets).

While waiting for the individual domains to finish, I'll do a try2 run
to remove clashes and breaks and try to form the sheets better.

Tue Jul  8 22:23:29 PDT 2008 Kevin Karplus

I may want to do another run, since try2 does not seem to be reducing
breaks as fast as I had expected.

Thu Jul 10 15:32:03 PDT 2008 Kevin Karplus

I made a chimera of try2-opt3 and R100-P176/try1-opt3:
	N-terminal region from  TR429.try2-opt3.pdb
	I105-P176 from R100-P176/try1-opt3

I'll optimize this as try3, using the same costfcn as for try2.

I'll also cut up the TR429 model and optimize the parts in the subdomains.

Thu Jul 10 15:40:38 PDT 2008 Kevin Karplus

E22-A101/try2 optimization of TR429 started.  Currently the try1-opt3
model scores much better, though it has worse breaks, so I may want to
try doing an optimization of that as well.

The bad breaks are before V75 and I76, very close to the unresolved
region 55-72.  I might try a chimera that copies 55-76 from TR429 into
try1-opt3, and optimize that.
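
A quick way to locate breaks like these outside of undertaker is to look
for consecutive residues whose CA atoms are too far apart.  The sketch
below is a stand-in, not the undertaker break cost: it assumes
fixed-column PDB ATOM records, a single chain, and a hypothetical cutoff
of 4.2 Angstroms (consecutive CA atoms are normally about 3.8 Angstroms
apart).

```python
import math


def ca_coords(pdb_lines):
    """Return {residue_number: (x, y, z)} for CA atoms in PDB ATOM records."""
    coords = {}
    for line in pdb_lines:
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            resnum = int(line[22:26])
            coords[resnum] = (
                float(line[30:38]),
                float(line[38:46]),
                float(line[46:54]),
            )
    return coords


def find_breaks(pdb_lines, max_ca_dist=4.2):
    """List (residue_number, distance) where the CA is too far from the previous CA.

    max_ca_dist is an assumed cutoff, not a value taken from undertaker.
    Residues whose predecessor is absent from the file are skipped.
    """
    coords = ca_coords(pdb_lines)
    breaks = []
    for resnum in sorted(coords):
        prev = coords.get(resnum - 1)
        if prev is not None:
            dist = math.dist(prev, coords[resnum])
            if dist > max_ca_dist:
                breaks.append((resnum, dist))
    return breaks
```

Each reported residue number is the one the break comes "before", matching
the way the breaks are described above.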

Thu Jul 10 15:48:15 PDT 2008 Kevin Karplus

R100-P176/try2 optimization of TR429 started.  I almost certainly want
to optimize the try1 models, since the TR429 model is really awful in
the C-terminal domain.


Thu Jul 10 15:59:20 PDT 2008 Kevin Karplus

E22-A101/try3 optimization started to optimize
E22-A101/chimera-try1-init, which is mostly from the
E22-A101/try1-opt3 model, but which has K55-I76 from TR429, since the
E22-A101/try1-opt3 model had bad breaks in that region (it is not
important that we get this region right, since it is a floppy loop
that was not resolved in the crystal, but eliminating the break there
will make it easier to optimize the rest of the domain).

Thu Jul 10 17:16:00 PDT 2008 Kevin Karplus

For the N-terminal domain, we have two main lines:
	try3-opt3.gromacs0
	N-try3 = E22-A101/try3-opt3
If doing a cut-and-paste operation, V97 is a good common residue.
I may want to copy L71-I76 from try3-opt3 to the N-try3 domain, to
move the break into the loop region that we don't care about.

I think it is worth making a chimera of the try3 with the N-try3
domain, and optimizing that as try4.

Thu Jul 10 17:44:36 PDT 2008 Kevin Karplus

I made chimera-N3-try3  
  E22-A70, Y77-V97 from E22-A101/try3-opt3
  L71-I76, R98-P176 from try3-opt3.gromacs0
and am optimizing it as try4.
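
The cut-and-paste for a chimera like this can be sketched in Python as
below.  This is only an illustration, not the tool actually used here: it
assumes the source models are fixed-column PDB files with a single chain,
that they share the same residue numbering, and that they have already
been superposed into a common frame.  The labels "N-try3" and "try3", the
function splice, and the constant CHIMERA_N3_TRY3 are names made up for
the example.

```python
def splice(sources, segments):
    """Build a chimera from residue ranges of several models.

    sources:  {label: list of PDB lines}
    segments: list of (label, first_residue, last_residue), inclusive
    Returns the selected ATOM lines ordered by residue number.
    Atom serial numbers are NOT renumbered.
    """
    covered = set()
    atoms = []
    for label, first, last in segments:
        span = set(range(first, last + 1))
        if covered & span:
            raise ValueError(f"segment {first}-{last} overlaps an earlier segment")
        covered |= span
        atoms.extend(
            line for line in sources[label]
            if line.startswith("ATOM") and first <= int(line[22:26]) <= last
        )
    # Stable sort: atoms keep their original order within each residue.
    atoms.sort(key=lambda line: int(line[22:26]))
    return atoms


# Segment plan for chimera-N3-try3, copied from the notes above.
CHIMERA_N3_TRY3 = [
    ("N-try3", 22, 70),   # E22-A70  from E22-A101/try3-opt3
    ("try3", 71, 76),     # L71-I76  from try3-opt3.gromacs0
    ("N-try3", 77, 97),   # Y77-V97  from E22-A101/try3-opt3
    ("try3", 98, 176),    # R98-P176 from try3-opt3.gromacs0
]
```

Calling splice(sources, CHIMERA_N3_TRY3) with the two models loaded as
lists of lines would give the ATOM records for the chimera; the junctions
at 70/71, 76/77, and 97/98 are where new breaks are most likely to appear.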

I'll take the C-terminal part of try3-opt3.gromacs0 and try
re-optimizing it in R100-P176.

Thu Jul 10 17:52:12 PDT 2008 Kevin Karplus

R100-P176/try3 will try optimizing the C-terminal domain of try3-opt3.gromacs0.
R100-P176/try4 will try optimizing all the models of R100-P176 (I
expect it to concentrate on R100-P176/try2-opt3, as that has much
smaller breaks than the from-try3 model).


Thu Jul 10 19:52:11 PDT 2008 Kevin Karplus

R100-P176/try4-opt3 (from R100-P176/try1-opt3) scores better than 
R100-P176/try3-opt3.gromacs0, due mainly to smaller breaks.

The C-terminal domain has two lineages of solutions:
	try4-opt3	from SAM+undertaker
	try3-opt3.gromacs0	from TR429

I'll do one more optimization with the N and C domains both coming
from the SAM lineages (try5 from chimera-try4-C4).

I'll also put together a chimera with both halves coming from the
TR429 lineage (chimera-N2-C2).

Thu Jul 10 20:42:30 PDT 2008 Kevin Karplus

For chimera-N2-C2, I ended up using a little bit of the linker from try4:
   E22-R100	E22-A101/try2-opt3
   A101-L110	try4-opt3.gromacs0
   E111-P176	R100-P176/try2-opt3

I'll optimize this as try6.

There is a little bit of sheet in the N2 model that might be worth
trying to incorporate into the other lineage also---I might want to
reoptimize with the try6 costfcn from the try5 models also, when they
are done.  Or, maybe, I should try doing another run on just the first domain.

Thu Jul 10 20:52:39 PDT 2008 Kevin Karplus

E22-R100/try4 started to try to get good sheets from both lineages.

Fri Jul 11 00:04:15 PDT 2008 Kevin Karplus

I should probably patch in the N4 domain to make another model.

I also should investigate making L27-V30 into a strand, as predicted.


Sun Jul 13 09:27:46 PDT 2008 Kevin Karplus

try7 started to optimize chimera-N4-try5.

Sun Jul 13 10:03:02 PDT 2008 Kevin Karplus

I tried looking for a model (among the server models and
SAM+undertaker models) that would pack the two barrels better, but I
didn't find one.  I don't think that I have the time or the tools to
dock the domains against each other.

Sun Jul 13 13:08:38 PDT 2008 Kevin Karplus

try7-opt3 doesn't score as well as try5-opt3, though the N4.sheets are better.
Rosetta likes try7-opt3.gromacs0.repack-nonPC best.
It might be worth doing another optimization just from the
gromacs-optimized try7 model.  Or maybe I should patch N5 into try7
and optimize that.

It's probably not worthwhile to patch in N5, as it didn't make the
extra strand that I had requested, though it may have improved a
couple of the Hbonds.  It would take a while to clean up the clashes
and breaks in the N5 domain, so I may be better off optimizing try7
without patching in N5 (I could use the sheet constraint for trying to
add the extra strand, though).

Sun Jul 13 13:25:30 PDT 2008 Kevin Karplus

try8 started from all gromacs optimized models (except the try5 ones)
to attempt improvements to try7.  The added_strand constraint set was
added to try to get a better N-terminus.

Sun Jul 13 14:40:55 PDT 2008 Kevin Karplus

try8-opt3 scores best with the try8 costfcn and
try8-opt3.gromacs0.repack-nonPC scores best with the Rosetta energy function.

Sun Jul 13 14:58:03 PDT 2008 Kevin Karplus

I'll do one more polishing run for try2, the rather awful model
polished from TR429, and use the try9 scoring to help choose which models
to submit.

Sun Jul 13 16:33:46 PDT 2008 Kevin Karplus

try9-opt3 improves on try2-opt3.gromacs0, but it will still only be
model 5.

The big question is whether try5-opt3 or
try8-opt3.gromacs0.repack-nonPC should be my number 1 model.

I like the helix at A70-D74 in try8 better, and it gets slightly
better burial (until gromacs and rosetta mess it up), but I think I'll
leave try5 first.  The differences are fairly small and probably don't
matter much.

Sun Jul 13 16:58:20 PDT 2008 Kevin Karplus

Submitted with comment:

    For a REFINEMENT model, the predictions are redone for the sequence
    included in the refinement model, then both the automatic model and
    supplied model are further optimized, initially taking sheet and helix
    constraints from the supplied model to refine.

    In the case of TR429, the supplied model is rather terrible, with bad
    breaks and clashes and incompletely formed sheets.  The domains were
    separately predicted, and predictions were pasted into the supplied model.
    Only model 5 was optimized directly from TR429, though model 4 was
    constructed from domains separately optimized from TR429.

    The other models submitted have nothing left of the original TR429
    model, except the placement of the two domains, which I believe is
    incorrect, as the hydrophobic residues I104, F166, I164 are all
    exposed---I believe they should be packed against the other barrel.

    I tried looking for a model (among the server models and
    SAM+undertaker models) that would pack the two barrels better, but I
    didn't find one.  I don't think that I have the time or the tools to
    dock the domains against each other properly.

    Model

    1 TR429.try5-opt3.pdb	# < chimera-try4-C4
	    chimera-try4-C4:
		    initially TR429.try4-opt3.gromacs0.pdb
		    L110-P176 from R100-P176/try4-opt3.gromacs0 < R100-P176/try1-opt3.gromacs0 < align(2f5kA)
	    try4-opt3 < chimera-N3-try3
	    chimera-N3-try3:
		    E22-A70, Y77-V97 from E22-A101/try3-opt3 < E22-A101/chimera-try1-init
		    L71-I76, R98-P176 from try3-opt3.gromacs0
	    try3-opt3 < chimera-try2-C1
	    chimera-try2-C1:
		    N-terminal region from  TR429.try2-opt3.pdb
		    I105-P176 from R100-P176/try1-opt3 < align(2f5kA)
	    try2-opt3 < TR429 (initial model)

	    E22-A101/chimera-try1-init:
		    mostly from E22-A101/try1-opt3 < align(1mhnA)
		    loop K55-I76 from TR429.pdb


    2 TR429.try8-opt3.gromacs0.repack-nonPC.pdb	# < try7-opt3.gromacs0 < chimera-N4-try5
	    # best Rosetta energy
	    chimera-N4-try5:
		    E22-V97 from E22-A101/try4-opt3 < try3-opt3 < E22-A101/chimera-try1-init
		    R98-P176 from TR429.try5-opt3.pdb (see model 1)

    3 TR429.try4-opt3.gromacs0.pdb	# < chimera-N3-try3

    4 TR429.try6-opt3.gromacs0.pdb	# < chimera-N2-C2
	    chimera-N2-C2:
		    E22-R100	E22-A101/try2-opt3 < TR429
		    A101-L110	try4-opt3.gromacs0
		    E111-P176	R100-P176/try2-opt3 < TR429

    5 TR429.try9-opt3.pdb	# < try2-opt3.gromacs0 < TR429

Mon Nov 10 10:44:42 PST 2008 Kevin Karplus

By GDT, model5 is the best I submitted, though model3 does better by real_cost.
GDT is not really improved by the refinement, but other real_cost
measures are.