Wed Jun  9 10:20:51 PDT 2004
T0202

DUE 6 Aug 2004

Wed Jun  9 18:04:43 PDT 2004	Kevin Karplus

This looks like a fold-recognition target with 3pfk (c.89.1.1) or
2tysB (c.79.1.1) as the best match.  It may be a new fold, but I'm
hoping for an existing one, as there are a LOT of beta strands.

try1 is astonishingly ugly---only one strand-helix-strand has formed:
Hbond I43.N	P64.O	# 3.05223
Hbond I43.O	F66.N	# 2.86163
Hbond S45.N	F66.O	# 3.00959

I'll put these into try2, along with the strand constraints that were
accidentally omitted.

Hmmm, the break penalties may be a bit too high, making it difficult
to move pieces around enough.  The beta sheets for this one may need
to be constructed by hand.

Wed Jun  9 22:40:10 PDT 2004 Kevin Karplus

try2 doesn't look much better than try1.
This one will probably need hand assembly.

Thu Jun 10 09:43:00 PDT 2004	Kevin Karplus

The only thing that is coming from the initial alignments is the
strand-helix-strand of I43-G47 paired with P64-I68.  We'll have to do
the rest of the pairing manually!

We can get some of the hairpins, I think.

Fri Jun 11 15:14:26 PDT 2004	Kevin Karplus

Try4-opt2 doesn't score quite as well as try3-opt2 (even with the
try4.costfcn), but does have a number of hairpins.
Perhaps we should try doing a crossover optimization from the models
so far, and see if anything can be improved.  I'll lower break and
clash penalties a bit.  This will be try5.


Sat Jun 12 08:34:48 PDT 2004	Kevin Karplus

In try5-opt2, a beta sheet is beginning to form.  The N-terminal
hairpin belongs in it somewhere, but I'm not sure where.

K196-A201 (predicted as strand) has curled up into a helix, but it
probably belongs antiparallel to S213-D217 and to Y189-V191:

188>     pYVVsm
201<   eAIVEIKre
210>  gqKSVDFd

Sat Jun 12 21:07:13 PDT 2004	Kevin Karplus

I have some bits and pieces of beta sheet now in try6, but still a lot
more to form---this looks like it will be a slow process of guessing
sheet topologies, adding constraints, and seeing which work out.
We could run a few in parallel, but I'm having enough trouble working
on multiple targets without also working on several conjectures for
each. 

Fri Jun 25 18:30:27 PDT 2004	Kevin Karplus

Ligand news from CASP:
T0202  --  NADPH


Thu Jul 15 11:44:40 PDT 2004 Kevin Karplus

It looks like the mutual-information predictions are pretty strong for this
target, so those constraints may be worth using to increase the chance
of folding things right.

I modified superimpose-best.under to create helices and sheets files
for the models (undertaker, alignment, or robetta).  It looked to me
that robetta-model1 has a bad knot, so I'm scoring them all with the
"knot" cost function also.

It seems that robetta models 1 and 5 have knots, as do the first two
models from alignments (which is probably an artifact of missing residues).


Thu Jul 15 16:54:23 PDT 2004 Kevin Karplus

try7 has some fairly good sheet fragments.  I don't like it making
E28-F30 into a strand though.


Fri Jul 16 15:02:16 PDT 2004 Kevin Karplus

Maybe I need to create a "strands" rasmol script to label the probable strands:
define s1	2-7
define s2	12-16
define s3	42-45  
define s4	65-69  
define s5	97-99  
define s6	102-107
define s7	113-119
define s8	127-134
define s9	137-143
define s10	146-150
define s11	173-178
define s12	189-191
define s13	196-201
define s14	204-208
define s15	211-215
define s16	219-224
define s17	230-232

define beta s1 or s2 or s3 or s4 or s5 or s6 or s7 or s8 or s9 or s10 or s11 or s12 or s13 or s14 or s15 or s16 or s17

define h1	21-27
define h2	50-58
define h3	81-91
define h4	158-163

define hall h1 or h2 or h3 or h4
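
A minimal Python sketch of one way to generate the define lines from a
single table, so that the union sets ("beta", "hall") are built from
the same names as the individual definitions.  The helper name
rasmol_defines is hypothetical; the residue ranges are copied from the
lines above, and the output uses only the "define <name> <range>" and
"or" forms already shown.

```python
# Residue ranges copied from the define lines above.
strands = {
    "s1": (2, 7), "s2": (12, 16), "s3": (42, 45), "s4": (65, 69),
    "s5": (97, 99), "s6": (102, 107), "s7": (113, 119), "s8": (127, 134),
    "s9": (137, 143), "s10": (146, 150), "s11": (173, 178),
    "s12": (189, 191), "s13": (196, 201), "s14": (204, 208),
    "s15": (211, 215), "s16": (219, 224), "s17": (230, 232),
}
helices = {"h1": (21, 27), "h2": (50, 58), "h3": (81, 91), "h4": (158, 163)}


def rasmol_defines(ranges, union_name):
    """Return one 'define' line per range, plus one line for their union.

    Hypothetical helper: it only formats text, it does not talk to RasMol.
    """
    lines = [f"define {name}\t{lo}-{hi}" for name, (lo, hi) in ranges.items()]
    # The union is built from the same keys, so no set can be left out.
    lines.append(f"define {union_name} " + " or ".join(ranges))
    return lines


script = "\n".join(rasmol_defines(strands, "beta") + rasmol_defines(helices, "hall"))
print(script)
```

Redirecting the printed text to a file gives a script that can be
loaded the same way as the hand-written one.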


Fri Jul 16 16:59:41 PDT 2004 Kevin Karplus

In addition to the "easy" sheet constraints we've guessed already, and
some weak constraints from George's predictions, I'm going to put in
some strong constraints to cluster the 3 conserved ASP residues (D49,
D145, and D217) assuming that they form a catalytic triad.


Fri Jul 16 19:48:24 PDT 2004 Kevin Karplus

try8 scores well under the new cost function, but the aspartic acids
have not come together yet.  
(I have a rasmol script highlighting the triad, called "triad".)

It looks like s15 should be parallel to s3 and s4 and antiparallel to
s10.  s3 and s4 are making a nice parallel connection now, so the
question is one of ordering.  Treating s3 || s4 as one unit, there are
3! = 6 orders:

A	s3  || s4  || s15 ^v s10
B	s3  || s4  ^v s10 ^v s15
C	s15 || s3  ||  s4 ^v s10
D	s15 ^v s10 ^v s3  || s4
E	s10 ^v s3  || s4  || s15
F	s10 ^v s15 || s3  || s4

One of the neural-net constraints (P165-I175) weakly implies that s10
|| s11---a constraint we had already put in.
If that is correct, then we can rule out orders B and D, where s10 is
in the middle of the sheet and has no free edge left for s11.

We also have s15 ^v s16, so by the same argument we can rule out A and F
(s15 in the middle), leaving only C and E:

C	s17 ^v s16 ^v s15 || s3  ||  s4 ^v s10 || s11
E       s11 || s10 ^v s3  || s4  || s15 ^v s16 ^v s17

Another constraint, L76-P165, implies that the helix between s4 and s5
is near the helix between s10 and s11, so I favor the C ordering.

One problem with this conjecture---s15 is almost all hydrophobic, and
s3 is almost all polar.  I'll try it anyway for try9, with increased
weight on the triad constraints also.
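
The elimination of orders above can be reproduced mechanically.  This
is only a sketch under the same assumptions as the notes: s3 || s4 is
kept as one fixed unit, and a strand that must pair with a strand
outside this group (s10 with s11, s15 with s16) has to sit at an edge
of the four-strand arrangement.  The function names are hypothetical.

```python
from itertools import permutations

# s3 || s4 is treated as a single rigid unit, as in orders A-F above.
units = [("s3", "s4"), ("s15",), ("s10",)]


def flatten(order):
    """Turn a tuple of units into a flat list of strand names."""
    return [strand for unit in order for strand in unit]


def at_edge(order, strand):
    """True if the strand is first or last, i.e. has a free edge."""
    return order[0] == strand or order[-1] == strand


orders = [flatten(p) for p in permutations(units)]

# s10 needs a free edge for s11, and s15 needs a free edge for s16.
surviving = [o for o in orders if at_edge(o, "s10") and at_edge(o, "s15")]

for order in surviving:
    print(" ".join(order))
```

Only two arrangements survive, s15 s3 s4 s10 and s10 s3 s4 s15, which
are orders C and E.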


Fri Jul 16 22:14:32 PDT 2004 Kevin Karplus

try10 is the same as try9, but with the fixed version of undertaker
that gets the hbonds right (I hope) in the SheetConstraints.
try10 should be able to do a better job, since it has more consistent
constraints internally.


Sat Jul 17 09:57:14 PDT 2004 Kevin Karplus

try9-opt2 scores a bit better than try10-opt2, but neither one is
really great.  I'll have to spend some more time looking at the pieces
and seeing whether I can fit them together.


Sun Jul 18 12:46:55 PDT 2004 Kevin Karplus

I didn't get the time or energy yesterday to do anything with this
target.   There are very few sheets actually formed:

try8-opt2
SheetConstraint (T0202)F42 (T0202)V46	(T0202)P64 (T0202)I68	hbond (T0202)I43
SheetConstraint (T0202)D129 (T0202)V134	(T0202)I142 (T0202)V137	hbond (T0202)L132
SheetConstraint (T0202)V214 (T0202)D217	(T0202)E223 (T0202)I220	hbond (T0202)D217


try9-opt2
SheetConstraint (T0202)I43 (T0202)V46	(T0202)I65 (T0202)I68	hbond (T0202)I43
SheetConstraint (T0202)I128 (T0202)V130	(T0202)I142 (T0202)D140	hbond (T0202)V130
SheetConstraint (T0202)A131 (T0202)V134	(T0202)D140 (T0202)V137	hbond (T0202)V134

try10-opt2
SheetConstraint (T0202)I43 (T0202)V46	(T0202)I65 (T0202)I68	hbond (T0202)I43
SheetConstraint (T0202)R133 (T0202)V134	(T0202)E138 (T0202)V137	hbond (T0202)V134

The 1pfkA structure used as a template has two sheets:
The terminal sheet is s3 ^v s2 || s1 || s4 || s5  || s10 ^v s11
The middle sheet is s7 || s8 || s6 || s9.

(Strands are numbered in order along the chain of 1pfkA; the numbering
does not necessarily correspond to the numbering of target strands.)

Nowhere in the predicted secondary structure do we have the
strand/helix alternation to make a 4-strand parallel sheet, needed for
matching EITHER sheet of 1pfkA.  I think this is a poor match for fold recognition.

The second best domain hit was for 2tysB, which also has a 4-strand
parallel sheet in the middle but with order CBAD, rather than BCAD.
The other sheet is complicated with A ^v B || F || C || D || E, with
the domain having the 4-strand sheet inserted between B and C.
Again, I don't have anywhere near enough helices for this fold to be useful.


For try11, I'm giving up on the templates to a large extent, and just
putting together antiparallel constraints based on adjacency in the
sequence.  I'll try adding a couple of weaker sheet constraints for
other possibilities.

try11 will read in the alignments, but not do TryAllAlign---using
fragment insertion rather than starting from a bad initial alignment.


Sun Jul 18 21:43:13 PDT 2004 Kevin Karplus

try11 is an ugly mess with many breaks.
Only s3 || s4 formed nicely.  Still, it scores much better with the
try11.costfcn than anything else we've looked at, so maybe polishing
it with bigger break penalties might produce something feasible.


Mon Jul 19 08:26:41 PDT 2004 Kevin Karplus

try12 is taking an extremely long time on whinny.  After I get out the
opt1 version, I'll kill the job, look at the result, and start a much
shorter run.


Tue Jul 20 13:57:58 PDT 2004 Kevin Karplus

I got busy and let it finish.  The results for try12-opt2 are still
very ugly, with lots of breaks.  I obviously have some (or all) of the
sheet constraints wrong.  Still, with an unconstrained cost function,
try12 scores the best of any of our models.

I'm wondering if the antiparallel action is really happening in some
sort of sandwich structure, with the turns moving to the other sheet,
rather than back to the same sheet.


Wed Jul 21 09:16:06 PDT 2004 Kevin Karplus

I'm going to try removing a lot of sheet constraints, putting in just
the ones for strongly predicted turns at
	P109-D110	S6 ^v S7
	D135-G136	S8 ^v S9
	M193-E194	S12 ^v S13
	D209-D210	S14 ^v S15
	D217-G218	S15 ^v S16

I'll also include all the strongly predicted (P>0.5) contacts from
George's "280.rr" predictions.
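
A minimal sketch of the P>0.5 filter, ASSUMING the predictions are in
the CASP RR layout (one contact per line: "i j d_min d_max
probability").  Whether 280.rr follows that layout exactly is an
assumption, the sample lines below are invented purely for
illustration, and strong_contacts is a hypothetical helper, not an
undertaker command.

```python
def strong_contacts(rr_text, min_prob=0.5):
    """Return (i, j, prob) for contact lines with prob strictly above min_prob.

    Assumes CASP RR-style lines 'i j d_min d_max prob'; anything that does
    not parse as such (headers, sequence lines, END) is skipped.
    """
    contacts = []
    for line in rr_text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue
        try:
            i, j = int(fields[0]), int(fields[1])
            prob = float(fields[4])
        except ValueError:
            continue
        if prob > min_prob:
            contacts.append((i, j, prob))
    # Strongest predictions first.
    return sorted(contacts, key=lambda c: -c[2])


# Invented sample data, NOT taken from the real 280.rr file.
sample = """PFRMAT RR
TARGET T0000
MODEL 1
10 50 0 8 0.82
12 48 0 8 0.50
20 90 0 8 0.31
33 71 0 8 0.64
END
"""

for i, j, prob in strong_contacts(sample):
    print(i, j, prob)
```

The comparison is strict, so a contact at exactly 0.50 is dropped;
that matches "P>0.5" as written, but the cutoff is a parameter.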

Before I start the run, I'll need to create a T0202.t04.many.frag
fragment file---especially since t2k and t04 disagree about the
secondary structure predictions in places.


Wed Jul 21 11:05:53 PDT 2004 Kevin Karplus

Try13 seems to be mainly polishing try9, which may not be the most
desirable thing to do.

It might be worth starting from the alignments with the same cost function.
I'll set that up as try14, but I'll change the constraint weights a
bit to put more weight on the helix and strand constraints.

Wed Jul 21 11:07:44 PDT 2004	ggshack

Just to take a look, I am starting from try13, switching to
bonus_constraint and boosting break.  Starting TRY15 on caw.


Wed Jul 21 17:49:13 PDT 2004 Kevin Karplus

Someone rebooted crow at 13:51, killing my try14 job.  Foo!  I wish
whoever did it had at least sent me a message saying they were going
to kill my job!  Since try14 produced nothing before it was killed,
I'll have to restart it somewhere.

George just sent me a note (AGAIN FORGETTING TO PUT ANYTHING IN THE
README FILE):

    Subject: T0202.try15-opt1 is available
    Date: Wed, 21 Jul 2004 17:10:41 -0700

    This is the one I ran with bonus_constraint. I thought it would
    put the RR constraints closer but it appears it didn't, not as
    much as I thought. On the other hand, some sheets appeared (but
    not as phobic as they should be...).

    - George


Thu Jul 22 14:53:28 PDT 2004 Kevin Karplus

try14 scores best with unconstrained.costfcn.
It looks really terrible to me, with many strands wound up into helices.

With George's try15 costfcn, try15 scores best.  Some of the hairpins
look decent, but the overall sheet topology still needs work.  Still,
it's better than many models we've created for this target.

With the try14 costfcn, the best-scoring are try13 and try15.
try13 again has some decent bits of super-secondary structure in an
overall poor model.  I'd be hard pressed to choose between try13 and
try15--they both look mostly wrong.  try13 brings the rr constraints a
bit closer together, but doesn't seem to really match them.


Sat Jul 24 07:27:33 PDT 2004 Kevin Karplus

I should try SOMETHING on this target again, though I don't feel we've
made a lot of progress.

The unconstrained cost function likes try14 best, then try13, try15,
try12, try5, try9, try6, try8, try2, try1, ...

I've not been able to get agreement from the t2k and t04 multiple
alignments about what the secondary structures at the N- and
C-terminal ends are.  Most of the conserved residues agree, except for
a conserved D near the end, which t2k thinks is D217, but t04 thinks
is D209.

try14-opt2.sheets:
SheetConstraint I43 V46  	I65 I68  	hbond S45
SheetConstraint F100 V103  	C144 F147  	hbond R102
SheetConstraint P101 V103  	L171 C173  	hbond V103
SheetConstraint I117 L120  	F147 A150  	hbond I117
SheetConstraint K212 D217  	S225 I220  	hbond S213

try13-opt2.sheets:
SheetConstraint I43 V46  	I65 I68  	hbond I43
SheetConstraint I128 V130  	I142 D140  	hbond V130
SheetConstraint A131 V134  	D140 V137  	hbond L132

try15-opt2.sheets:
SheetConstraint V6 K8	  	K14 H12  	hbond V6
SheetConstraint F42 G47  	P64 N69  	hbond I43
SheetConstraint C105 M108  	L114 V111  	hbond S106
SheetConstraint D129 V134  	I142 V137  	hbond L132
SheetConstraint K212 D217  	S225 I220  	hbond D215
SheetConstraint I222 K224  	V44 V46  	hbond E223

try12-opt2.sheets:
SheetConstraint V44 V46  	F66 I68  	hbond S45
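
When comparing these lists it helps to see at a glance which pairings
are parallel and which are antiparallel.  The sketch below ASSUMES that
the direction of the second residue range encodes the orientation
(I65 I68 running upward against I43 V46 is parallel, I142 V137 running
downward against D129 V134 is antiparallel), which is consistent with
s3 || s4 and the hairpins in these notes but is my reading of the
format, not a documented rule.  parse_sheet_constraint is a
hypothetical helper.

```python
import re


def parse_sheet_constraint(line):
    """Parse one SheetConstraint line into two residue ranges and an orientation.

    Assumption: both ranges running the same way means parallel,
    opposite ways means antiparallel.
    """
    # Drop chain prefixes such as "(T0202)" so only residue names remain.
    tokens = re.sub(r"\([^)]*\)", "", line).split()
    if len(tokens) != 7 or tokens[0] != "SheetConstraint" or tokens[5] != "hbond":
        raise ValueError(f"not a SheetConstraint line: {line!r}")
    # Residue tokens look like "I43": one-letter amino acid code, then number.
    a1, a2, b1, b2 = (int(tok[1:]) for tok in tokens[1:5])
    same_direction = (a2 - a1) * (b2 - b1) > 0
    orientation = "parallel" if same_direction else "antiparallel"
    return (a1, a2), (b1, b2), orientation


examples = [
    "SheetConstraint I43 V46  \tI65 I68  \thbond S45",
    "SheetConstraint K212 D217  \tS225 I220  \thbond S213",
    "SheetConstraint (T0202)D129 (T0202)V134\t(T0202)I142 (T0202)V137\thbond (T0202)L132",
]
for text in examples:
    print(parse_sheet_constraint(text))
```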


I'll try putting these into a cost fcn, with some attempt to reduce
the conflicts, and add in lots of George's RR constraints (switching
to "bonus" constraints below 0.6).


Sat Jul 24 16:35:22 PDT 2004 Kevin Karplus

try16 is full of breaks, but reasonably compact.  It made only a
slight improvement on try13 in the optimization, but it is the new
best with the unconstrained costfcn.

Maybe I should try a polishing run with an unconstrained cost function
(or perhaps helix and strand constraints, but no others) and give up
on this target.  We can submit whatever scores adequately with one of
the cost functions, and whatever Rosetta hates least, but I don't
think I have the energy or the inspiration to come up with a decent
model for this one.

Sat Jul 24 21:12:41 PDT 2004 Kevin Karplus

try17-opt2 looks very familiar---it is probably just a minor polishing
of try15 (in fact, the try15 costfcn scores try17-opt2 first).

unconstrained cost function now orders
try16, try14, try17, try13, try15, try12, try5, try9, ...

Rosetta likes best the try14-opt2.repack-nonPC, but perhaps I should
say "hates least", since all the energies are enormous.
(Order currently is try14, try17, try15, try16, try13, ...)


Sun Jul 25 22:04:01 PDT 2004 Kevin Karplus

unconstrained now orders them
try16, try14, try17, try18, try13, try15, ...

Rosetta orders them
try14, try17, try15, try18, try16

strands.costfcn orders them
try17, try15, try14, try16, try18, ...
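
One crude way to compare the three orderings is a rank sum over the
models that appear in all of them.  This is only an illustrative
sketch, not how the submission list was chosen, and it uses only the
models actually listed above (the "..." tails are ignored).

```python
# Orderings copied from the three lists above (best first).
rankings = {
    "unconstrained": ["try16", "try14", "try17", "try18", "try13", "try15"],
    "rosetta": ["try14", "try17", "try15", "try18", "try16"],
    "strands": ["try17", "try15", "try14", "try16", "try18"],
}

# Only models ranked by every cost function can be compared fairly.
common = set.intersection(*(set(order) for order in rankings.values()))


def rank_sum(model):
    """Sum of the model's positions (0 = best) among the common models."""
    total = 0
    for order in rankings.values():
        restricted = [m for m in order if m in common]
        total += restricted.index(model)
    return total


# Lower rank sum is better; ties are broken alphabetically.
consensus = sorted(common, key=lambda m: (rank_sum(m), m))
for model in consensus:
    print(model, rank_sum(model))
```

try14 and try17 tie for the lowest rank sum (3 each), with try15 and
try16 tied behind them (7 each) and try18 last (10).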

I think we should submit

	try17-opt2
	try14-opt2.repack-nonPC
	try16
	try1
	best model from alignment
	

Mon Jul 26 17:10:55 PDT 2004 Kevin Karplus

The superimpose-best script shows 
ReadConformPDB T0202.try18-opt2.pdb
ReadConformPDB T0202.try14-opt2.pdb
ReadConformPDB T0202.try17-opt2.pdb
ReadConformPDB T0202.try1-opt2.pdb

InFilePrefix 
ReadConformPDB T0202.t2k.undertaker-align.pdb model 1

Was this change from the README list (try18 first, try16 dropped) a
result of discussion, or did I just forget?
The README is more recent, so I'll go with it (also try17 looks better
to me than try18).


Sun Sep 19 10:21:33 PDT 2004 Kevin Karplus

I put REAL_PDB:=1suwA into the Makefile and did a whole-chain rmsd evaluation.

The order for our submitted models and the robetta models is
	model5, robetta2, robetta5, model4, model3, robetta1,
		robetta4, model1, model3, model2
	
The model5 rmsd is artificially good, because the model is incomplete.
So this appears to be a target that robetta beat us on, with both
robetta2 and robetta5 better than our models, and the fully automatic
model4 beating our hand-improved models.

There are better models than any we submitted: try6-opt2 and
try11-opt2 would both have beaten robetta5.


Wed Sep 22 10:37:49 PDT 2004 Kevin Karplus

The model5 number was bogus.  Using GDT score to evaluate the models
we get
	robetta1   20.48%
	robetta5   19.68%
	robetta2   18.78%
	robetta4   15.96%
	robetta3   14.66%
	try10-opt1 10.64%	our best
	model5     10.14%
	model1	    9.44%
	model2      8.63%
	model3      8.23%
	model4      8.13%

These numbers are terrible.
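
For reference, a simplified sketch of a GDT_TS-style score: the
average, over cutoffs of 1, 2, 4, and 8 Angstroms, of the percentage of
target residues whose CA atom lies within the cutoff after
superposition.  This is NOT the evaluation code used for the numbers
above: real GDT searches over many superpositions, while this takes the
distances from a single superposition as given, and the distances in
the example are invented.  Dividing by the full target length is what
keeps an incomplete model (like model5) from looking artificially good,
as it did under rmsd.

```python
def gdt_ts(distances, target_length, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """GDT_TS-style percentage for one fixed superposition.

    distances: CA-CA distances (Angstroms) for the residues present in
    the model.  Residues missing from the model simply contribute nothing,
    because the denominator is the full target length.
    """
    within = sum(sum(d <= c for d in distances) for c in cutoffs)
    return 100.0 * within / (len(cutoffs) * target_length)


# Invented distances for illustration only.
distances = [0.5, 1.5, 3.0, 6.0, 12.0]

complete = gdt_ts(distances, target_length=5)     # model covers the whole target
incomplete = gdt_ts(distances, target_length=10)  # same residues, half the target
print(complete, incomplete)
```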

Fri Sep 24 13:35:46 PDT 2004 Kevin Karplus

Switching to smooth GDT, we get terrible results still:
name			length	missing_atoms	rmsd	rmsd_ca	GDT		smooth_GDT

robetta-model1.pdb.gz	249	 0.0000		23.1629	22.4953	-20.5823	-19.7968
robetta-model5.pdb.gz	249	 0.0000		22.0141	20.9691	-20.3815	-18.9094
robetta-model2.pdb.gz	249	 0.0000		19.9898	19.1176	-18.7751	-18.2417
robetta-model4.pdb.gz	249	 0.0000		23.1815	22.6266	-16.0643	-15.4171
robetta-model3.pdb.gz	249	 0.0000		23.5823	22.5253	-14.7590	-14.4397
T0202.try3-opt2.pdb.gz	249	 0.0000		24.6744	23.8962	-10.2410	-10.1982
model1.ts-submitted	249	 0.0000		23.3773	22.6291	-9.4378 	-9.5230
model5.ts-submitted	249	1509		11.0238	 9.9730	-9.2369 	-8.9004
model2.ts-submitted	249	 0.0000		23.6508	23.0157	-8.6345 	-8.4842
model4.ts-submitted	249	 0.0000		22.3748	21.9120	-8.1325 	-8.1672
model3.ts-submitted	249	 0.0000		23.1261	22.4288	-8.2329 	-8.1612