13 June 2002 Kevin Karplus

T0135 seems to be an alpha/beta protein, but we don't have a close
enough homolog to get a good alignment.  The try1 run has not kept the
sheet together, and has wound one of the strands into a helix.

25 June 2002 Kevin Karplus

The try2 run, using the new alignments and fragments from the STR HMM,
is not doing any better---none of the four strands is paired and the
C-terminal strand is wound into a helix.

There ARE some sheets in the initial alignments---perhaps things will
go better once H-bond scoring is added.  In the meantime, we could try
guessing the hydrogen bonding from the alignments, and adding constraints.


11 July 2002 Kevin Karplus

I'm remaking fragments with the new version of fragfinder, in the
hopes of getting the beta sheet to form by magic (unlikely without
H-bond cost functions).

The STR prediction has 4 strands.  The first is predicted to be
antiparallel or mixed, the second to be an antiparallel edge, the third
to be parallel (center or edge), and the last to be mixed.

We also have a long helix between strands 1 and 2, so they should
probably be oriented the same way.

What topologies are consistent with this prediction?

I don't see any.  The best I could do are
	^v^^
	2143	matches STR but has problem with helix between 1 and 2
	
	^^v^
	3142	doesn't match for strand 4.

Strand 2 could also be a parallel edge (almost as good as antiparallel)
which yields the following topology:	
	^^vv
	2143
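
The topology question above can be checked by brute force.  What
follows is a minimal Python sketch, not part of undertaker.  It assumes
the notation used in these notes (lower line = strand order across the
sheet, upper line = direction of each strand) and encodes each STR label
as a (position, pairing) pair, which is a simplification of the STR
alphabet.  All names in it are made up for illustration.

```python
from itertools import permutations, product

ANTI, PAR, MIX = "antiparallel", "parallel", "mixed"

def classify(order, orient):
    """Map each strand to (position, pairing) for one sheet layout.

    order  -- strand numbers listed across the sheet, e.g. (2, 1, 4, 3)
    orient -- one '^' or 'v' per sheet position, e.g. "^v^^"
    """
    labels = {}
    last = len(order) - 1
    for i, strand in enumerate(order):
        kinds = set()
        for j in (i - 1, i + 1):
            if 0 <= j <= last:
                kinds.add(PAR if orient[i] == orient[j] else ANTI)
        position = "edge" if i in (0, last) else "center"
        pairing = kinds.pop() if len(kinds) == 1 else MIX
        labels[strand] = (position, pairing)
    return labels

def consistent(order, orient, allowed, same_direction=()):
    """True if every strand gets an allowed label and each listed
    pair of strands points the same way."""
    labels = classify(order, orient)
    if any(labels[s] not in allowed[s] for s in order):
        return False
    where = {s: i for i, s in enumerate(order)}
    return all(orient[where[a]] == orient[where[b]]
               for a, b in same_direction)

def search(allowed, same_direction=(), n=4):
    """List one representative of each consistent topology."""
    hits = []
    for order in permutations(range(1, n + 1)):
        if order[0] > order[-1]:
            continue  # same sheet read from the other side
        for rest in product("^v", repeat=n - 1):
            orient = "^" + "".join(rest)  # first strand fixed as '^'
            if consistent(order, orient, allowed, same_direction):
                hits.append(("".join(map(str, order)), orient))
    return hits

# Assumed encoding of the STR prediction described above.
STR_PRED = {
    1: {("edge", ANTI), ("center", ANTI), ("center", MIX)},
    2: {("edge", ANTI)},
    3: {("edge", PAR), ("center", PAR)},
    4: {("center", MIX)},
}
HELIX = [(1, 2)]  # long helix between strands 1 and 2: same direction

print(search(STR_PRED))           # [('2143', '^v^^')]
print(search(STR_PRED, HELIX))    # []

# Relaxation: let strand 2 be a parallel edge as well.
RELAXED = dict(STR_PRED)
RELAXED[2] = STR_PRED[2] | {("edge", PAR)}
print(search(RELAXED, HELIX))     # [('2143', '^^vv')]
```

Under these assumptions the search agrees with the hand analysis:
nothing satisfies both the STR labels and the helix condition until
strand 2 is allowed to be a parallel edge.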

So perhaps we could try adding some constraints. 

The try3 run gets a new best score (no constraints yet), but still
does not form a beta sheet.


18 July 2002 Yael/Jenny

Working on a try4 run, using the 
	^^vv
	2143
topology as a constraint.


18 July 2002 Yael/Jenny

Discovered bug in constraint definitions which caused improper orientation
of strands (all antiparallel to each other!). Will run again as try5, with 
the proper definitions.

20 July 2002 Kevin Karplus

Although try6-opt is the best-scoring, it looks terrible, with the
sheet not forming and one strand coiling up into a helix.  The try4
run looks much better, aside from one bad break.  The topology for
try4 is 
	^v^v
	4132
which is consistent with str predictions for 1 and 2, but not 3 or 4.
	
The strand 2-3 pairing did not have any Hbond constraints in try4, but
S45-T70 and N47-E68 paired by themselves.

The strand 1-4 pairing K10-F106, T12-D104, L14-V102 has register off
by 1 from the constraints provided for try4.  Maybe we should try
extending the pairing seen in try4 to make a better model with this
topology. 

Putting together the constraints derived from the alignment seen in
try4-opt, naturally try4-opt scores the best of any of the existing decoys.
Let's do a try7 run to see if we can improve things (particularly the
chain breaks).  I'll also set up the script to save the template atoms
for faster loading on future runs, and run scwrl on each iteration to
try to keep the sidechain packing from interfering with the backbone folding.

The big question for us is whether this antiparallel topology is
correct, or whether we should be playing more with the ^^vv topology.
We could explore the ^^vv topology more by using CB constraints
instead of Hbond constraints---they are less sensitive to getting the
phase of the strands exactly right.

Perhaps we need to create new files with the constraints for the
different topologies, and just use an include command in define-score.script.
That will make it easier to switch between different score definitions.

20 July 2002 Kevin Karplus

Using the constraints in try7-4132^v^v.constraints,
T0135.try7-scwrl.30.30.pdb is the best scorer (outperforming
try7-opt-scwrl, which accidentally overwrote try7-opt).

In try7-opt, there is a pretty bad break between H65 and A66, R100
seems to have twisted around to the wrong side of the sheet, and the
whole thing is not packed as tightly as I'd like---still it is more
convincing than anything else I've seen for this target.

Perhaps we should do a run with just the 1-4 strand pairings, and see
what comes up out of the alignments and fragments.  Perhaps this would
help us decide what the right sheet topology is.  Using the
try8-14^v.constraints, the best current decoy is still
try7-scwrl.30.30, so it is likely that the try8 run won't find us
anything really great.  (Try7 started from the try4 conformation, so
was mainly doing tweaking, while try8 is starting from scratch.)

I've also submitted the try7-opt conformation to VAST (ID: VS29918
Password: casp5t0135).  This should give us an alignment to a real
structure that might provide a better core alignment than the ones
we've been working with.
The best alignment is to 2bopA, and it looks quite good, with 16.7%
identity and 1.7 rmsd over 54 residues.  Also good are the alignments
to 1a7gE and 1fj7A.  All three are in SCOP fold d.58 (d.58.7 or d.58.8).

2bopA	d.58.8
1qupA	d.58.17
1qd1B	d.58.34
1a7gE	d.58.8
1fj7A	d.58.7
1qm9A	d.58.7
1bs0A	c.67.1
1scjB	d.58.3
1fe4A	d.58.17
1b3tA	d.58.8
1cc8A	d.58.17
1kp6A	d.58.25
1mla	d.58.23
1h6kZ	d.58.7
...
I should grab some of the alignments out of the CN3D alignment editor and use
them as alignments to try in undertaker.

Note: our 3rd highest hit (1ha1) is from d.58.7.1, so this is probably
the source for that fold, though 1f9fA (d.58.8.1) is also one of the
high hits, as is 1fj7A (d.58.7.1).  Also d.58 (1louA) is the (weak) consensus
of the CAFASP servers.

Unless something else good emerges, we'll probably go with a d.58
prediction of some sort---if we're wrong, at least we'll be in good company.

It seems like even erroneous constraints (as in try4) drive the sheet
formation faster than a smaller set of supposedly correct
constraints---try8 is not getting nearly as many "good" structures as
try4, when judged on the try8 scoring function.
This suggests a strategy when trying to collapse beta-sheets---add
some arbitrary "collapsing" functions (like keeping the centers of the
strands near each other) and see what emerges, then try refining the
constraints to get cleaner sheets.
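
One way such an arbitrary "collapsing" function could look is sketched
below in Python.  This is not an undertaker cost function: the name
collapse_penalty, the use of CA centroids as strand centers, and the
slack distance are all assumptions made for illustration.

```python
import math

def centroid(points):
    """Average of a list of (x, y, z) coordinates."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))

def collapse_penalty(strands, slack=5.0):
    """Sum of squared excess distances between strand centers.

    strands -- one list of (x, y, z) CA coordinates per strand
    slack   -- distance (same units as the coordinates) below which a
               pair of strand centers costs nothing; an arbitrary
               placeholder value, not a tuned parameter
    """
    centers = [centroid(s) for s in strands]
    total = 0.0
    for a in range(len(centers)):
        for b in range(a + 1, len(centers)):
            d = math.dist(centers[a], centers[b])
            total += max(0.0, d - slack) ** 2
    return total

# Two toy strands whose centers are 8 units apart.
toy = [[(0, 0, 0), (0, 0, 2)], [(8, 0, 0), (8, 0, 2)]]
print(collapse_penalty(toy))  # 9.0
```

Because every pair of strands is pulled together regardless of order,
a term like this says nothing about topology; it would only be useful
for the first "see what emerges" stage.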

21 July 2002 Kevin Karplus

try8 never got as good a score as try7, but it did beat try4.
try8-opt-scwrl tries pairing D19-H65 and F17-F67, with no other
sheet-forming hbonds.  Note that these Hbonds are NOT ones that it was
looking for, and are off by 2 from the ones try7 was looking for
(T15-F67, F17-H65).  try8 looks like a dead end.
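
In an antiparallel pairing the residue numbers of paired residues sum
to a constant, so a register shift shows up as a change in that sum.
A small Python sketch of the check (the function name is made up):

```python
def register(pairs):
    """Return i + j for an antiparallel ladder of (i, j) residue pairs.

    Raises ValueError if the pairs do not share one register.
    """
    sums = {i + j for i, j in pairs}
    if len(sums) != 1:
        raise ValueError(f"inconsistent register: {sorted(sums)}")
    return sums.pop()

try7_wanted = [(15, 67), (17, 65)]   # T15-F67, F17-H65
try8_found = [(19, 65), (17, 67)]    # D19-H65, F17-F67

print(register(try8_found) - register(try7_wanted))  # 2
```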

I picked up a lot of the VAST alignments to try7-opt, and tried
editing a few of them to lengthen the aligned regions---particularly
in places where try7-opt had bad breaks.  I selected alignments by
VAST P-value, by number of aligned residues, and by %identity.

For try9, I'll keep the limited constraints of try8, and try inserting
the VAST alignments (but not use try7-opt as a starting point, since
the clash reduction and optimization done there makes it hard to add
alignments or fragments).

Hmm---minor problem.  The files passed through a macintosh, since I
was working at home, and so are not proper UNIX files.  I had to use
emacs to read the files (emacs understands MAC files), then copy the
contents to another buffer (where emacs assumes UNIX format), then
save the file.
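
The same conversion can be scripted instead of done by hand in emacs.
A minimal Python sketch, assuming the files use bare carriage returns
(classic Mac) or CR-LF as line endings; the function names are made up:

```python
def mac_to_unix(data: bytes) -> bytes:
    """Turn CR-LF and bare CR line endings into LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")

def convert_file(src, dst):
    """Read src as raw bytes and write it to dst with UNIX line endings."""
    with open(src, "rb") as infile:
        data = infile.read()
    with open(dst, "wb") as outfile:
        outfile.write(mac_to_unix(data))

print(mac_to_unix(b"line one\rline two\r"))  # b'line one\nline two\n'
```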

Also, before running try9, I changed the BEST_EVALUE threshold in
Makefile to allow in more alignments in the default scripts, remaking
T0135.t2k.best-scores.rdb T0135.t2k-2track-undertaker.a2m and
everything that depends on them.

Scores on the try9 run are looking much better than the try8 run---it
is almost certain to do better than the try4 run, and may compete with
try7. Hmm, looking at a couple of the early iterations, it seems that
strands 413 are joining nicely, and the predicted bend between 3
and 4 is modeled well, but strand 2 is way out in space. There are
double H-bonds between T12-D104, L14-V102, R16-R100,  T15-F67,
L13-S69, H11-F71, which are compatible with the constraints from try7.

If try9 doesn't fix strand2 in the final pool, I'll try using the
try7-4132^v^v.constraints, or creating a new set of constraints based
on what I see in the best of the try9 structures.

21 July 2002 Kevin Karplus

try9-opt-scwrl scores very slightly worse than try7-opt, but does not
have the strand 2 fixed.  I guess I'll have to do another run, with
the try7 constraints.  With this scoring function, try9 does rather
poorly (since the constraints for strand2 are not met).

I'll seed the try10 run with a couple of the better-scoring try9
decoys, but not with the try7-opt decoy, since it is a local minimum
in the scoring function and may keep the algorithm from exploring more
of the structure space.

21 July 2002 Kevin Karplus

try10-opt-scwrl is not quite as good a score as try7-opt, but comes
fairly close, which is a bit surprising as strand2 has still not
attached to the sheet.  Perhaps I should try another run, with both
try7-opt and try10-opt-scwrl as initial conformations, seeing if some
crossover action will produce a better model.  Of course, what we
really need is a double-crossover, with the child of AAA and BBB being
ABA, since the 1st and 3rd strands are well placed, and only the 2nd
one is badly placed.  This operator would be a bit of a nuisance to
implement. 
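
The ABA operator itself is simple if a conformation can be treated as
a list of per-residue records (for example backbone torsion angles).
That representation is an assumption of this Python sketch, which is
not how undertaker is known to store conformations; the hard part in
practice would be choosing the cut points and repairing the joins.

```python
def double_crossover(parent_a, parent_b, cut1, cut2):
    """Child takes [cut1:cut2] from parent_b and the rest from parent_a."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    if not 0 <= cut1 <= cut2 <= len(parent_a):
        raise ValueError("need 0 <= cut1 <= cut2 <= length")
    return parent_a[:cut1] + parent_b[cut1:cut2] + parent_a[cut2:]

child = double_crossover(list("AAAAAA"), list("BBBBBB"), 2, 4)
print("".join(child))  # AABBAA
```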

21 July 2002 Kevin Karplus

try11-opt scores slightly better than try11-opt-scwrl---SCWRL reduces
clashes, improves the rotamer probabilities slightly, and reduces the
radius of gyration, but increases all the burial costs (except gen6.5).
I suspect that the scwrl result is slightly better---the extra burial
value probably comes from sidechains bumping into each other keeping
the helix too far from the sheet.

Try11 is clearly based mainly on try7---it does not have the desirable
bent helix of try9, and it still has the bad break between H65 and
A66.  The structure is a little bit loose and "foamy"---I wonder how I
can induce a tighter packing.

I'll use the sumperimpose.under script to superimpose try9-opt-scwrl,
try10-opt-scwrl, and try11-opt-scwrl, then piece together a
mix-and-match PDB file to use as a starting point.  Unfortunately, the
crufty old version of rasmol that runs on macs does not allow viewing
multiple models, so choosing the pieces will have to wait until I can
see them on my Linux box.


22 July 2002 Kevin Karplus

T0135.try9-10-11.super superimposes 3 decoys.
  model 1 is called T0135.try9-opt-scwrl.pdb
  model 2 is called T0135.try10-opt-scwrl.pdb
  model 3 is called T0135.try11-opt-scwrl.pdb

Note: these break reports use a 0-based numbering system---that should
be changed to use the PDBNum numbers.
T0135.try9-opt-scwrl.pdb has 2 breaks
	T0135.try9-opt-scwrl.pdb breaks before 67 with cost 0.0726077
	T0135.try9-opt-scwrl.pdb breaks before 97 with cost 0.0855783
T0135.try10-opt-scwrl.pdb has 2 breaks
	T0135.try10-opt-scwrl.pdb breaks before 66 with cost 0.049496
	T0135.try10-opt-scwrl.pdb breaks before 98 with cost 0.0423624
T0135.try11-opt-scwrl.pdb has 3 breaks
	T0135.try11-opt-scwrl.pdb breaks before 41 with cost 0.034909
	T0135.try11-opt-scwrl.pdb breaks before 65 with cost 0.0886167
	T0135.try11-opt-scwrl.pdb breaks before 100 with cost 0.0609589
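
Until the reports are changed, the 0-based positions can be converted
after the fact.  The Python sketch below parses the report lines shown
above and assumes that PDBNum numbering starts at 1 with no gaps, which
matches the H65-A66 break already noted for try11.  The function name
and the output layout are made up.

```python
import re

BREAK_RE = re.compile(r"^\s*(\S+) breaks before (\d+) with cost (\S+)\s*$")

def parse_breaks(report, first_pdb_num=1):
    """Return (file, residue_before, residue_after, cost) per break line.

    The report gives the 0-based index of the residue that follows the
    break; adding first_pdb_num turns that into its residue number.
    """
    breaks = []
    for line in report.splitlines():
        match = BREAK_RE.match(line)
        if match is None:
            continue  # e.g. the "has 3 breaks" summary line
        after = int(match.group(2)) + first_pdb_num
        breaks.append((match.group(1), after - 1, after,
                       float(match.group(3))))
    return breaks

report = """T0135.try11-opt-scwrl.pdb has 3 breaks
\tT0135.try11-opt-scwrl.pdb breaks before 41 with cost 0.034909
\tT0135.try11-opt-scwrl.pdb breaks before 65 with cost 0.0886167
\tT0135.try11-opt-scwrl.pdb breaks before 100 with cost 0.0609589
"""
for name, before, after, cost in parse_breaks(report):
    print(f"{name}: break between {before} and {after}, cost {cost}")
```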

For the helix or helices from K74 to somewhere in the low 90s, I like
model 1 (try9-opt-scwrl) best, but I could change to model 2 easily at L97.


For strand2 (M43-H47), I like model 3 (try11-opt-scwrl) best, 
since it is the only one that really forms the Hbonds.

I improved the superposition algorithm used for superimposing the
models, and now have a much easier time doing cut-and-paste.

Let's try cut-and-paste:

strand1            -R16      model 1
helix1+strand2     F17-G53   model 3
minihelix+strand3  M54-E72   model 2
helixcluster       S73-T96   model 1
strand4            L97-*     model 1
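
The segment table can be applied mechanically once the three models
are superimposed.  The Python sketch below assumes each model has
already been read into a dict from residue number to whatever record
holds that residue's atoms; reading and writing PDB files is left out,
and all names are made up for illustration.

```python
# (first residue, last residue, model number); None marks an open end.
SEGMENTS = [
    (None, 16, 1),   # strand1            -R16
    (17, 53, 3),     # helix1+strand2     F17-G53
    (54, 72, 2),     # minihelix+strand3  M54-E72
    (73, 96, 1),     # helixcluster       S73-T96
    (97, None, 1),   # strand4            L97-*
]

def source_model(resnum, segments=SEGMENTS):
    """Return the model number that supplies residue resnum."""
    for first, last, model in segments:
        if ((first is None or resnum >= first)
                and (last is None or resnum <= last)):
            return model
    raise ValueError(f"residue {resnum} is not covered by any segment")

def assemble(models, segments=SEGMENTS):
    """Build the chimera from models = {model number: {resnum: record}}.

    Assumes every model contains the same residue numbers.
    """
    resnums = sorted(next(iter(models.values())))
    return {r: models[source_model(r, segments)][r] for r in resnums}

print([source_model(r) for r in (16, 17, 53, 54, 72, 73, 97)])
# [1, 3, 3, 2, 2, 1, 1]
```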

22 July 2002 Kevin Karplus

try12-opt-scwrl, started from the cut-and-paste model, scores almost (but not
quite) as well as try11-opt-scwrl.  There are 5 breaks, including bad
ones at 20-21, 66-67, and 97-98.  I wish ReduceBreak were working
better---then maybe we could seal up the gaps and have a good model.

I tried using the new pred_alpha2 cost function in place of alpha and
alpha_prev---this makes the gap between try11-opt and try12-opt-scwrl
somewhat larger.

I'll try a new run starting from try12-opt-scwrl with the new cost
function, and with priors for ReduceBreak and CloseGap increased.
Already on the first iteration it gets a better score than try11-opt,
the previous best.


23 July 2002 Kevin Karplus

try13-opt is the new best score (better than try13-opt-scwrl).  There
are still 5 breaks: 24-25, 41-42, 66-67, 72-73, 97-98.  The worst of
them is the 97-98, with 66-67 close behind.  The spacing at 97-98 is
not too bad, but the OG of S98 is making a bond to C of L97---changing
the psi angle of S98 would almost close the gap.  The one at 66-67 is
a large gap in mid-strand---one that I would have thought ReduceBreak
could fix.

The helices are packed rather loosely against the sheet, though it
looks like they could nestle in closer with a few sidechain rearrangements.

I looked at the superposition of try13-opt, try12-opt-scwrl,
try11-opt-scwrl, try10-opt-scwrl, and try9-opt-scwrl, to see if I
could find a way to close the gap at 66-67.  It looks like about the
best I can do is to copy 64-66 from try10-opt-scwrl into try13-opt.
This creates a new (smaller) gap, which I hope will get swept into the
sheet by CloseGap.  I created this chimera in T0135-cut-and-paste-2.pdb
It doesn't score as well as try13-opt (or many of the other try13
runs), mainly because H65 is still turned the wrong way, and sliding
it down doesn't flip it over.

Just noticed that I had the score function defined wrong---forgot the
coefficient after pred_alpha2, so the weights for it and contact order
were screwed up. The same observations apply after fixing the score
function. 

23 July 2002 Kevin Karplus

The best-scoring decoy is now try14-opt.  There are still 5 breaks,
and the one at 98-99 is particularly large.

I got a couple more alignments from VAST to one of the early
iterations of try14 (cs29973, password casp5t0135), and so for try15
I'll start with just alignments, not seeding with a conformation.  I
don't expect this to do phenomenally well, but I hope to be able to do
a cut-and-paste between try14-opt and try15-opt.

24 July 2002 Kevin Karplus

Try15-opt-scwrl does not score as well as try14-opt, but has only 3
breaks, two of them (around 65 and 96) smaller than the corresponding
breaks in try14-opt.

I should copy H65 and A66 from try15-opt-scwrl to try14-opt, but the
break around 96 looks better in try14-opt, despite the worse score
(try15 had messed up the helix to reduce the gap.)

I should also copy Y29-M43, to close the gap at 41.

In try16 iterations, the H65 and A66 keep getting attached to T64
instead of F67, even though the break at A66-F67 is much smaller.
Perhaps this has to do with the way
AlignedFragments::merge_all_short_segments works, which tries to merge
with the longer neighbor first---it may be better to merge with the
closer segment first.

I killed try16 and will try again (try17) with
merge_all_short_segments changed.

Hmm, this isn't helping, since the cost of the break at 64 is higher
than the cost at 66, so undertaker still favors joining in the
way I like less.  How can I force the right join?  
I'll let try17 run while I think about it, since on the first
iteration it came up with a new best score.

Hmm---I'll try choosing randomly which segment to merge with, favoring
the closer one.  I'll run this as try18, but still let try17 finish,
since it may do better.  The first 4 iterations of try18 all merged
the short segment the way I DON'T like---I'll have to see if any of
the later ones managed to sample the way I DO like.
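
The random choice can be sketched as follows in Python.  This is not
the code in AlignedFragments::merge_all_short_segments; weighting each
side by the inverse of its gap size is an assumption picked only to
illustrate "favoring the closer one", and the names are made up.

```python
import random

def choose_merge_side(gap_left, gap_right, rng=random):
    """Pick 'left' or 'right' with probability proportional to 1/gap.

    gap_left, gap_right -- sizes of the breaks on either side of the
                           short segment (any consistent distance unit)
    rng                 -- random module or a random.Random instance
    """
    eps = 1e-6  # avoids division by zero for a perfectly closed gap
    weights = [1.0 / (gap_left + eps), 1.0 / (gap_right + eps)]
    return rng.choices(["left", "right"], weights=weights)[0]

rng = random.Random(0)
picks = [choose_merge_side(3.0, 1.0, rng) for _ in range(10000)]
print(picks.count("right") / len(picks))  # close to 0.75
```

With weights like these the less-liked join is still sampled about a
quarter of the time in this toy case, so several iterations in a row
merging the wrong way would not be surprising.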

24 July 2002 Kevin Karplus

The new best scorer is try18-opt (not try18-opt-scwrl), but it still
has the bad break after A66.  I can also see how the helices and sheet
should inter-digitate, but I don't know how to get the packing to happen.

25 July 2002 Kevin Karplus

Maybe the new JiggleSubtree operator could help improve the packing?

Yes, it seems to---the best new score is T0135.try19.2.40 (undertaker
crashed, perhaps because I had recompiled it while running, so there
was no try19-opt).

Let's try another run, with OptSubtree as well as JiggleSubtree.
try20-opt is a new best score, beating even try20-opt-scwrl.
The scwrl run has a better score for most measures, but slightly worse
on contact order.

Perhaps I should do one more run, reducing the weight of contact
order, and using the new OptSegment and OptClash operators to try to pack the
helices tighter against the sheet.  Maybe I should temporarily add a
constraint to try to get the helices to pack better---maybe CD1 of L81
against C of L101 and C of V102 and CD1 of L94 against CB Q99 and CG2 T15.

Adding the constraints still leaves try21-opt as the best scorer.

26 July 2002 Kevin Karplus

I tried twiddling the packing constraints a bit, using
Constraint 750	784	2 3.3 5	// CD1 L94	CB Q99
Constraint 750	115	2 3.2 5	// CD1 L94	CG2 T15

Constraint 806	691	2 3.2 5	// CD1 L101 	CG L86
Constraint 806	750	2 3.2 5	// CD1 L101 	CD1 L94

Constraint 100	651	2 3.2 5	// CD1 L13	CE2 Y80
Constraint 100	659	2 3.2 5	// CD1 L13	CG L81
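
For bookkeeping, lines in this form can be read back with a few lines
of Python.  The sketch only splits the fields; treating the three
numbers as distances (possibly minimum, desired, and maximum) is a
guess that should be checked against undertaker before relying on it,
and the function name and dict keys are made up.

```python
def parse_constraint(line):
    """Split one 'Constraint a b x y z // label label' line into fields."""
    body, _, comment = line.partition("//")
    fields = body.split()
    if len(fields) != 6 or fields[0] != "Constraint":
        raise ValueError(f"not a constraint line: {line!r}")
    return {
        "atoms": (int(fields[1]), int(fields[2])),
        # Meaning of these three values is assumed, not confirmed.
        "distances": tuple(float(x) for x in fields[3:6]),
        "labels": [part.strip() for part in comment.split("\t")
                   if part.strip()],
    }

example = "Constraint 750\t784\t2 3.3 5\t// CD1 L94\tCB Q99"
print(parse_constraint(example))
```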


Francisco Useche rebooted the machine that try22 was running on,
without telling me, so I'll have to start over on a different machine.


26 July 2002 Kevin Karplus

The try23-opt conformation looks pretty good, but is still a little
looser than I'd like.  Perhaps we can try to get some contact between
the aromatics F17 (CE1) and F93 (CZ), F46 (CE1) and Y33 (CE2), F71
(CE2) and Y80 (CD2).  Adding these constraints still leaves try23-opt
on top, so let's try another run, with the score parameters adjusted
to try to make packing more important.  Now try23.17.40 and
try23.16.40 beat try23-opt, so let's add them as starting points.

Hmm, try24 seems to be spending all its time doing CloseGap.
Maybe I made the prior probability of that too high.
The new best scorer is try24-opt, and it does look denser, but still
not as dense as I'd like.

Increasing the weight for the constraints and decreasing slightly the
weight for breaks and clashes should favor denser packing (at least
near where I imposed constraints).  Doing so makes try23-opt score
best again.  I'll try tweaking the weights a bit, and I'll turn off a
couple of constraints near the ends of the pair of helices I want to
pack in tighter, to allow a bit more flexibility (both K18-S98 bonds
and H11 N-O F71).


Let's start with both try23-opt and try24-opt and see
what happens. 

27 July 2002 Kevin Karplus

try25-opt is new best score.  If we turn off the "extra-packing.constraints",
then the best is try25-try23.15.40, with try25-opt as second best.

After re-optimizing (without packing constraints) new best is
try26-opt-scwrl (but after reading the pdb files back in, the cost
increases and try26-opt scores better).

I'll submit try26-opt-scwrl, and replace it if I can come up with a better one.
There is still a little looseness in the helix packing, and K74-Q78
has unwound a bit.  Let's try pasting in a helix (say from try13-opt)
for 74-81.

The helix gets unwound in a very similar way in try27-opt, but
try27-opt does score somewhat better than try26-opt, so I should resubmit.