Tue Jun 29 09:01:20 PDT 2004
T0219

Due	9 aug

Tue Jun 29 12:41:40 PDT 2004	Kevin Karplus

Comparative model, with strong hits to b.82.2.6.
They look like basically full-length alignments, and all the top
alignments agree on almost everything.


Tue Jun 29 16:09:13 PDT 2004	Kevin Karplus

The try1-opt1 model looks ok for most of the model, but the first
predicted strand P30-R34 is detached---it may need a SheetConstraint,
perhaps parallel to E206-L210, or perhaps between N61-M64 and V133-K137.
I should look at the templates to see what they do.


Wed Jun 30 00:20:51 PDT 2004	Kevin Karplus

try1 looks pretty good, but has lost some of the initial sheets.
For try2, I'll copy sheet constraints from the first two alignments
(to 1mzeA and 1h2kA), since they seem to be consistent.

The conserved residues are clustering nicely, so I'm pretty certain of
this model.

For try2, I'll start again from the alignments, but this time I'll be
using sheet constraints.  Assuming that works well, I'll then tweak up
the sheet constraints or just polish the resulting model.

Wed Jun 30 07:08:56 PDT 2004	Kevin Karplus

SCWRL got wedged on one of the 2cavA alignments and had to be killed.

Wed Jul 14 14:55:11 PDT 2004 Kevin Karplus

A big chunk of beta sandwich looks pretty good, but there is a helix
straightening out (V4-I13), and some strands curling up (V70-S74,
A96-D101).

The structure looks like 2 right-handed beta helices screwed together
(alternating strands from each), but the inserted alpha helices
disrupt the structure at the C-terminus.


Tue Jul 20 15:23:35 PDT 2004	Sol Katzman

Concentrating on the 1mzeA template, define the strands as follows:

S1 	StrandConstraint T5   S7   1
S2 	StrandConstraint E10  S11  1
S3 	StrandConstraint L31  R34  1
S4 	StrandConstraint Y59  V66  1
S5 	StrandConstraint F81  A88  1
S6 	StrandConstraint G95  T103 4
S7 	StrandConstraint K130 K137 1
S8a 	StrandConstraint G140 F142 1
S8b 	StrandConstraint A144 F146 1
S9 	StrandConstraint Y149 Q157 1
S10 	StrandConstraint E159 K166 1
S11 	StrandConstraint A205 T211 1
S12 	StrandConstraint T214 P219 1
S13 	StrandConstraint W223 S228 1
S14 	StrandConstraint T232 G240 1

In the template, strand S8 has a kink due to a Pro aligned to K143.

The 3 strongly conserved residues form a probable active site
in the template, corresponding to:
   H145, D147, H224

Given the closeness to the active site, I am leaving the break in S8
at K143.

Based on the template, there is a fair amount of curling between
S4 and S5, so I don't necessarily believe that V70-S74 should be
part of the strand, despite the moderate prediction in t2k (not
predicted at all in t04).

On the other hand, there is a clear need to complete the sheet
between S4 and S7 in try2, and the template indicates that S6
should go antiparallel in between those two. So I have made a guess
at the phase of Hbonding and included two new SheetConstraints for
try3. I also increased the hbond_geom_beta and hbond_geom_beta_pair,
as well as the overall weight for constraints.

Try3 leaves all the t2k and t04 alignment constraints and reads in
all the alignments, but I removed the redundant SheetConstraints in
favor of a single set with higher weighting.

Aside: S6 is only predicted by t2k-str2, not t04-str2.
Aside: the strongly conserved F21 may be an artifact, since it does
       not appear to be in the initial helix of the template 1mzeA.

The rasmol script pointed to by "strands" defines all the above
strands "S1" through "S14" as well as "bind" for the 3 active site
residues.
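
The "strands" script itself is not reproduced in this log. As a rough
sketch of how such a script could be generated from the table above,
the following writes one RasMol "define" line per strand plus "bind"
for the three active-site residues. The names STRANDS, BIND and
rasmol_defines are made up for illustration, and splitting S8 into
S8a/S8b follows the StrandConstraint table rather than the real script.

```python
# Strand ranges copied from the StrandConstraint table (T0219 numbering).
STRANDS = {
    "S1": (5, 7), "S2": (10, 11), "S3": (31, 34), "S4": (59, 66),
    "S5": (81, 88), "S6": (95, 103), "S7": (130, 137),
    "S8a": (140, 142), "S8b": (144, 146), "S9": (149, 157),
    "S10": (159, 166), "S11": (205, 211), "S12": (214, 219),
    "S13": (223, 228), "S14": (232, 240),
}
BIND = (145, 147, 224)  # H145, D147, H224


def rasmol_defines(strands=STRANDS, bind=BIND):
    """Return RasMol 'define' commands, one per line (hypothetical helper)."""
    lines = [f"define {name} {start}-{end}"
             for name, (start, end) in strands.items()]
    # In RasMol expressions a comma means "or", so this selects all three.
    lines.append("define bind " + ",".join(str(r) for r in bind))
    return "\n".join(lines)


print(rasmol_defines())
```

After loading such a script, "select S6" or "select bind" would pick
out the corresponding residues.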


Wed Jul 21 15:23:54 PDT 2004	Sol Katzman

The topology of the sheets from 1mzeA is as follows
(long links are marked where the connection could otherwise
have been a short turn):

                                  xlongx
                                  x    x
                        ^    S    ^    S    S
                        S8   S13  S10  S11  S1
                        S    S    S    S    S
                        P    S    S    S    S
                        S    S    S    S    S
         xlongx         S    V    S    V    V
         x    x
         S    ^    S    ^    S    ^    S    ^   ^
         S5   S4   S6   S7   S14  S9   S12  S3  S2
         S    S    S    S    S    S    S    S   S
         S    S    S    x    S    S    S    S   S
         S    S    S    S    S    S    S    S   S
         V    S    V    S    V    S    V    S   S
                   x    x
                   xlongx
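
The diagram can also be written down as data, which makes it easy to
list which neighboring strands should pair and in which sense. This is
only a sketch: the strand order and the up/down directions are my
reading of the "^" and "V" marks in the diagram, and the names TOP,
BOTTOM and neighbor_pairs are made up for illustration.

```python
# Strand order and direction read off the diagram ("^" = up, "V" = down).
TOP = [("S8", "^"), ("S13", "V"), ("S10", "^"), ("S11", "V"), ("S1", "V")]
BOTTOM = [("S5", "V"), ("S4", "^"), ("S6", "V"), ("S7", "^"), ("S14", "V"),
          ("S9", "^"), ("S12", "V"), ("S3", "^"), ("S2", "^")]


def neighbor_pairs(sheet):
    """List adjacent strands in a sheet and whether they run the same way."""
    pairs = []
    for (a, dir_a), (b, dir_b) in zip(sheet, sheet[1:]):
        kind = "parallel" if dir_a == dir_b else "antiparallel"
        pairs.append((a, b, kind))
    return pairs


for sheet in (TOP, BOTTOM):
    for a, b, kind in neighbor_pairs(sheet):
        print(f"{a:>3}-{b:<3} {kind}")
```

Read this way, S6 sits antiparallel between S4 and S7, and the only
parallel neighbors are S11-S1 and S3-S2.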


The results of try3 had the strand S6 placed between S4 and S7 as desired,
but the hbonding of S4 and S5 was disrupted and S6 did not hbond to S5.
Also lost was the S2-S3 bonding and some of the S3-S12 bonding. I will
increase the weights of a few SheetConstraints.

Also, try3 (and try2) had a kink in the hbonding of S9 to S14, with the
following inconsistent constraints: 
try3-opt2.sheets:SheetConstraint Y149 Q157	G240 T232	hbond T150
try3-opt2.sheets:SheetConstraint I156 G158	A234 T232	hbond Q157
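
One way to see why these two constraints disagree: in an antiparallel
pairing the sum of the paired residue numbers stays constant along the
strands. The sketch below assumes the four residue fields are the
start and end of the first strand followed by the matching start and
end of the second strand. That reading of the SheetConstraint fields
is an assumption, and resnum/register are made-up helper names.

```python
import re


def resnum(token):
    """'Y149' -> 149 (one-letter residue code followed by its number)."""
    return int(re.fullmatch(r"[A-Z](\d+)", token).group(1))


def register(line):
    """Return ('antiparallel', i+j) or ('parallel', j-i) for one line."""
    fields = line.split(":", 1)[-1].split()   # drop the grep file prefix
    a1, a2, b1, b2 = (resnum(t) for t in fields[1:5])
    if (a2 - a1) * (b2 - b1) < 0:             # strands run in opposite senses
        if a1 + b1 != a2 + b2:
            raise ValueError("paired ranges differ in length")
        return ("antiparallel", a1 + b1)
    return ("parallel", b1 - a1)


first = "try3-opt2.sheets:SheetConstraint Y149 Q157 G240 T232 hbond T150"
second = "try3-opt2.sheets:SheetConstraint I156 G158 A234 T232 hbond Q157"
print(register(first))    # sum 149+240
print(register(second))   # sum 156+234
```

Under that assumption the first constraint pairs residues with i+j=389
and the second with i+j=390, so the two are off by one residue in
register, which would fit the kink seen between S9 and S14.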

But the problems with the strands are minor, since the locations seem
pretty good. What is more problematic is the placement of the helices.

The C-terminal helices are longer than those in the 1mzeA template,
and seem to be folding in a direction which partially blocks access
to the binding site in the barrel as the helices trail out of S14.

Looking at the rr.280.constraints file, not many of the top scoring
ones involve residues above G240 (the end of S14) but let's add a bunch
of them to try4.

It is also worth tightening up the Fe binding site. These are measured
from the template 1mzeA, with the corresponding T0219 residues after the #:
Distance ASP189A.OD1-HIS187A.NE2: 3.824    # D147 - H145
Distance ASP189A.OD2-HIS267A.NE2: 2.943    # D147 - H224
Distance HIS267A.NE2-HIS187A.NE2: 3.233    # H224 - H145
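
The distances above were measured in rasmol. For scripting the same
measurement, a minimal sketch that reads fixed-column PDB ATOM records
could look like this. The helper names are made up, and the two demo
records use invented coordinates (a 3-4-5 triangle), not the real
1mzeA coordinates.

```python
import math


def atom_xyz(pdb_lines, chain, resseq, atom):
    """Coordinates of one atom from fixed-column ATOM/HETATM records."""
    for line in pdb_lines:
        if (line.startswith(("ATOM", "HETATM"))
                and line[12:16].strip() == atom
                and line[21] == chain
                and int(line[22:26]) == resseq):
            return (float(line[30:38]), float(line[38:46]),
                    float(line[46:54]))
    raise KeyError((chain, resseq, atom))


def distance(pdb_lines, a, b):
    """Distance in Angstroms between atoms a and b: (chain, resseq, atom)."""
    return math.dist(atom_xyz(pdb_lines, *a), atom_xyz(pdb_lines, *b))


def fake_atom(serial, atom, resname, chain, resseq, x, y, z):
    """Build a synthetic ATOM record (made-up coordinates, demo only)."""
    return (f"ATOM  {serial:5d} {atom:<4} {resname:>3} {chain}{resseq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


demo = [
    fake_atom(1, "NE2", "HIS", "A", 187, 0.0, 0.0, 0.0),
    fake_atom(2, "OD1", "ASP", "A", 189, 3.0, 4.0, 0.0),
]
print(f"{distance(demo, ('A', 189, 'OD1'), ('A', 187, 'NE2')):.3f}")
```

With the real template file, the same call on ('A', 189, 'OD1') and
('A', 187, 'NE2') should correspond to the first distance listed above.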

Due to the recent change in undertaker, we also need to increase the
weighting of hbond_geom in the cost function.


Thu Jul 22 14:26:16 PDT 2004	Sol Katzman

Due to a power outage, we only got try4-opt1. The C-terminal helices
did move but some of the strands were also lost. I suspect that too
many of the rr.280 constraints were used. Instead, I will take a
different tack and will simply use scaffolding constraints from the
1mze template to pin the two long C-terminal helices K277-N294 and
L298-S306 since these seem to have good alignment to the C-terminal
1mze helices.

To pin these I use the CA of a conserved metal-binding His and the CAs
at the start and end of S14:

		1mze	T0219
helix start	K297	K277
helix end	G317	N294
helix start	P319	L298
helix end	K331	S306

conserved H	H267	H224
S14 start	T278	T232
S14 end  	Y285	G240

These distances are measured in rasmol on the template 1mzeA:

Distance HIS267A.CA-LYS297A.CA: 18.588
Distance HIS267A.CA-GLY317A.CA: 32.787
Distance HIS267A.CA-PRO319A.CA: 26.850
Distance HIS267A.CA-LYS331A.CA: 32.540

Distance THR278A.CA-LYS297A.CA: 34.494
Distance THR278A.CA-GLY317A.CA: 44.096
Distance THR278A.CA-PRO319A.CA: 38.485
Distance THR278A.CA-LYS331A.CA: 48.224

Distance TYR285A.CA-LYS297A.CA: 19.516
Distance TYR285A.CA-GLY317A.CA: 32.504
Distance TYR285A.CA-PRO319A.CA: 29.294
Distance TYR285A.CA-LYS331A.CA: 39.823
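
To carry these twelve template distances over to the target, each
1mze residue is replaced by its T0219 equivalent from the table above.
A small sketch of that bookkeeping follows. The tuple layout and the
name to_target are made up, and no particular constraint-file syntax
is implied.

```python
# 1mze residue -> T0219 residue, from the correspondence table.
TEMPLATE_TO_TARGET = {
    "K297": "K277", "G317": "N294", "P319": "L298", "K331": "S306",
    "H267": "H224", "T278": "T232", "Y285": "G240",
}

# CA-CA distances in Angstroms, as measured in rasmol on 1mzeA.
TEMPLATE_DISTANCES = [
    ("H267", "K297", 18.588), ("H267", "G317", 32.787),
    ("H267", "P319", 26.850), ("H267", "K331", 32.540),
    ("T278", "K297", 34.494), ("T278", "G317", 44.096),
    ("T278", "P319", 38.485), ("T278", "K331", 48.224),
    ("Y285", "K297", 19.516), ("Y285", "G317", 32.504),
    ("Y285", "P319", 29.294), ("Y285", "K331", 39.823),
]


def to_target(distances, mapping=TEMPLATE_TO_TARGET):
    """Rewrite template residue pairs as T0219 CA atom pairs."""
    return [(mapping[a] + ".CA", mapping[b] + ".CA", d)
            for a, b, d in distances]


for a, b, d in to_target(TEMPLATE_DISTANCES):
    print(f"{a:8s} {b:8s} {d:7.3f}")
```

The first line printed pairs H224.CA with K277.CA at 18.588, and so on
through the twelve anchor-to-helix pairs.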

So try5 has eliminated the rr.280 constraints, and added the above
distance constraints. I also increase the weight of constraints in the
cost function.

Since try5 was still applying all the alignment constraints, as
well as doing the TryAllAlign, I will also launch a try6 in parallel
that starts from the existing tries and does not use the alignment constraints.

Fri Jul 23 21:06:42 PDT 2004	Sol Katzman

It turned out that try5 was quite successful in packing the C-terminal
helices back on the side of the sheets out of the way of access to
the Fe binding site. Interestingly, try6 still had a helix that potentially
could block the open end of the sheets.

Looking at breaks and clashes, it looks like Rosetta hates all the models
so far and there are lots of breaks and clashes. So try7 will use the same
techniques as try5 (read in all the alignment constraints and TryAllAlign)
but will increase the cost weights for soft_clashes and breaks.


Mon Jul 26 10:36:02 PDT 2004	Sol Katzman

In try7, the C-terminal helices after strand s14 were not as far out
of the way of the binding site as in try5, but were better than in try6.

Using an unconstrained cost function copied from Kevin's in T0228, the
best scores are for try6 and try1, followed by try3,try2,try5.

For try8, I will (as in try6) use the existing tries rather than TryAllAlign,
and remove the constraints from the alignments. However, in addition to
decreasing the cost function weight for 'constraint', I will increase the
individual weights on the constraints for the metal binding site and the
scaffolding for the C-terminal helices. I also increase the InitMethodProbs
for ReduceConstraint.

Mon Jul 26 17:38:35 PDT 2004	Sol Katzman

As we discussed in the Monday group meeting, and as noted above, the
main issue with the existing models is the placement of the C-terminal
helices. Kevin suggested pursuing the following approaches:
   a) separate that region (from A243 on) into a separate domain
      and try to model it, then place it somewhere with the rest
      of the model.
   b) use distance constraints to pin the helix into a cleft in
      the existing model where there are some exposed residues that
      are predicted buried by the Near models.

For approach (b), the following are some points on the weakly
predicted helices from try5 that are predicted as buried, that can be
matched to the points inside the cleft predicted to be buried:

on helix	in cleft
A251.CB 	Y25.CE2,  Y217.CE2
F262.CZ 	I237.CD1, A131.CB

These will be included in try9, which otherwise is like try5.


Tue Jul 27 15:05:06 PDT 2004	Sol Katzman

Disappointingly, in try9, the helices are nowhere near where I tried
to constrain them. Looking at try9-opt2.constraints, my 4 constraints
were in the top 5 positive costs but this was not good enough. By comparison, 
in try5, I had 12 constraints to pin back the helix, so for try10
I will increase the weights of these constraints (1.5 -> 6.5).

Incidentally, rosetta absolutely detested try9. Also, try8,try6,try5 all
scored better than try9 using try9.costfcn, which seems odd.

Tue Jul 27 22:22:58 PDT 2004	Sol Katzman

In try10, the first of the C-terminal helices, comprising Q264-N268,
is now closer to the cleft as desired. The next helix L280-L291
has four Leucines, which we will constrain near the next cleft
in try11:

on helix	in cleft
L280.CD2 	L152.CD1
L284.CD2 	F154.CZ,I237.CD1
L287.CD1	skip this one for now
L291.CD1	L235.CD2,I156.CD1
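
The table expands into one atom pair per cleft atom. A quick sketch of
that expansion follows, skipping the row marked "skip this one for
now". The name parse_pairs is made up, and the output is just a list
of atom pairs, not the actual constraint syntax.

```python
TABLE = """\
L280.CD2  L152.CD1
L284.CD2  F154.CZ,I237.CD1
L287.CD1  skip this one for now
L291.CD1  L235.CD2,I156.CD1
"""


def parse_pairs(table):
    """Return (helix_atom, cleft_atom) pairs, one per cleft atom listed."""
    pairs = []
    for row in table.splitlines():
        parts = row.split(None, 1)
        if len(parts) < 2 or parts[1].startswith("skip"):
            continue                       # blank row or deferred residue
        helix_atom, cleft = parts
        for cleft_atom in cleft.split(","):
            pairs.append((helix_atom, cleft_atom.strip()))
    return pairs


for helix_atom, cleft_atom in parse_pairs(TABLE):
    print(helix_atom, cleft_atom)
```

That gives five atom pairs for the L280-L291 helix, with L287 left out.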


Wed Jul 28 13:55:36 PDT 2004	Sol Katzman

In try11, the helix Q264-N268 curled away from the cleft that it
lined up with in try10. Also, the helix L280-L291 does not seem
to have ended up where desired. For try12, increase the weight
of constraint, and remove most of the predicted constraints. Instead,
just use the HelixConstraints from t04.str2.constraints and
t2k.str2.constraints.

Thu Jul 29 15:55:23 PDT 2004	Sol Katzman

Try12 looks terrible. We ended up with the helix Q109-E120 blocking
access to the binding site. For try13, go back to the full set of
predicted constraints, leave the high (45) weight on constraint,
but also reduce the weight of break to 5.


Fri Jul 30 09:30:28 PDT 2004	Sol Katzman

Try13 looks better. The residues from A243 to L291 comprising several
small portions of helices are now packed against the rest of the structure.

But some problems have been introduced:

a) the connector residues D68-R80 between strands S4 and S5 have moved
   into a bad position in front of the beta sandwich, blocking
   access to the binding site

b) there is a bad break at S128 in the connector between strands
   S6 and S7, just before the start of strand S7.

And the following problems remain:

c) W244 and M248 are exposed on the surface and predicted as buried.

d) the residues from P300 through F330 are floating away from
   the rest of the structure (in several small helices).

I do not have any good ideas on where P300-F330 should go,
but we can try to resolve both (c) and (d) at once by trying
to get F330 close to W244 for try14.


Fri Jul 30 18:25:02 PDT 2004	Sol Katzman

In an attempt to try something different, we can also pursue this track:
   a) separate that region (from A243 on) into a separate domain
      and try to model it, then place it somewhere with the rest
      of the model.

So I have created subdirectory 243-330 to hold that domain and
am running the scripts to fill it up.


Sun Aug  1 10:25:10 PDT 2004	Sol Katzman

For try14, there is no indication that F330 got any closer to W244
despite the added constraint. 

For try15, start from all previous tries (read-pdb-under) rather
than TryAllAlign. Pretty much the same constraints as in try14,
but increase the weight of wet and dry costs.


Mon Aug  2 10:34:55 PDT 2004	Sol Katzman

Try15 looks a lot like try14. Maybe it is time to try to pack it
in a little tighter. Increase the wet and dry weights while decreasing
the constraint weight slightly and removing the W244-F330 constraint,
since it seems not to be relevant.

Taking the other tack, I have used DeepView to merge the results of
the domain 243-330 try1-opt2 with full model try15-opt2. I produced
two chimeras:

T0219.dv.merge1.pdb
T0219.dv.merge2.pdb

These differ in the placement of the set of helices that comprise the
243-330 try1 domain. These helices are placed behind (outside) each
of the two sides of the beta-sandwich that encloses the metal binding
site. These chimeras are used to generate try16m1 and try16m2, where in
both cases we use TryAllAlign, but also read in merge1 or merge2, but
no other existing tries. The constraints are the same as in try16.
I do not know if this is the correct approach to using these merged files.


Tue Aug  3 04:48:40 PDT 2004 Kevin Karplus

I've downloaded several models and will look at them when I have time
(my battery is running out).  The method Sol used for including the
chimeras is probably ok, but the cost function may need to be modified
to include predicted information from the subdomain.


Tue Aug  3 10:15:57 PDT 2004	Sol Katzman

Try16 looks much like try15, as expected. While waiting for the results
of the merged runs, I will continue polishing with try17, by increasing
the weight of soft_clashes and breaks, and decreasing the weight of constraints.
Also, reduced gen_size to 100 (opt1,opt2) and super_num_gen to 200 (opt2).


Tue Aug  3 16:14:19 PDT 2004	Sol Katzman

Interestingly, although dv.merge1 and dv.merge2 were designed with
the C-terminal helices on the opposite sides of the beta sandwich,
the results from runs try16m1 and try16m2 both had the C-terminal
helices on the same side. The try16m1-opt2 packed these helices in
much closer to the rest of the model. The try16m2-opt2 has some of
them sticking out into space and is not a good model.

Note: the try16m1 and try16m2 have the C-terminal helices on the same
      side of the beta sandwich as dv.merge1


Wed Aug  4 02:10:25 PDT 2004 Kevin Karplus

I don't particularly like the way that the merge1 model buries the
hydrophilic side of the helix near A250-S258.  The merge2 models are
not much better.  They may be better than the alternatives though.
All the models have big holes in them still.


Wed Aug  4 18:01:50 PDT 2004	Sol Katzman

Of the models so far, rosetta likes try1 and try8, followed by 
try17,try6,try3,try16,try7,try16m1.

I will do some polishing on try16m1 by reading in all the try16m1
models, eliminating the constraints that were attempting to place the
C-terminal helices in various clefts, and increasing the weight of
soft_clashes and break in the cost function. That will be called
try17m1.


Thu Aug  5 13:42:43 PDT 2004	Sol Katzman

Rosetta still likes try1 and try8, but now likes try17m1 next,
followed by try17,try6,try3,try16,try7,try16m1 as noted above.

The try17m1 costfcn likes try8 best, then try6 and try17m1.

For further polishing, try18m1 will read in all the try16m1 and 
try17m1 models, plus dv.merge1. It will not TryAllAlign. Also
increase the wet and dry weights a little, and decrease the
weight of constraints (40 -> 30)


Fri Aug  6 11:40:57 PDT 2004 Kevin Karplus

try18m1 has some pretty big breaks.  May need to reoptimize with
breaks turned up somewhat.  (S128-S129 may cause problems, forming a knot.)
Helix W244-L252 has to turn sideways to fill gaps at each end.

Increasing the RR constraints may help W244 and L245 move to a better place.

I've set up try19m1.costfcn, but not started a run with it.
It should use read-pdb.under, since try19m1 scores the try18m1-opt2
file best.


The other alternative, try17-opt2, has smaller breaks, but many of them.
For try18, I took the c-terminal packing constraints of try17 and
merged them with the try19m1 costfcn.  We may have to do some tweaking
of weights to get try17-opt2 to score best, or just do the models that
scored as badly as try17-opt2.


Fri Aug  6 13:30:15 PDT 2004 Kevin Karplus

I redid the contact prediction for 243-330, but there really was no
point to it, since that region is an ORFan (no homologs in nr).


Sat Aug  7 14:12:25 PDT 2004	Sol Katzman

Not much improvement with try19m1. Still have large break at S128
and still have helix W244-L252 in the wrong orientation relative
to where it should connect.

Similarly, try18 still has lots of breaks, though it is a little
better than try17. (Kevin's statement above that try17 had smaller
but more breaks than try18m1 was actually arguable, when you sort them.)

Conformation[15] T0219.try19m1-opt2.repack-nonPC.pdb has 91 breaks
	T0219.try19m1-opt2.pdb.gz breaks before (T0219)S129 with cost 18.2877
	T0219.try19m1-opt2.pdb.gz breaks before (T0219)W244 with cost 17.7462
	T0219.try19m1-opt2.pdb.gz breaks before (T0219)L189 with cost 13.5307
	T0219.try19m1-opt2.pdb.gz breaks before (T0219)R253 with cost 8.50847
	T0219.try19m1-opt2.pdb.gz breaks before (T0219)H176 with cost 8.25258
	T0219.try19m1-opt2.pdb.gz breaks before (T0219)Y59 with cost 8.18638
	T0219.try19m1-opt2.pdb.gz breaks before (T0219)A97 with cost 7.31736
	T0219.try19m1-opt2.pdb.gz breaks before (T0219)V267 with cost 6.91126

Conformation[65] T0219.try18-opt2.pdb.gz has 96 breaks
	T0219.try18-opt2.pdb.gz breaks before (T0219)S278 with cost 33.2098
	T0219.try18-opt2.pdb.gz breaks before (T0219)K277 with cost 32.7518
	T0219.try18-opt2.pdb.gz breaks before (T0219)D204 with cost 20.1485
	T0219.try18-opt2.pdb.gz breaks before (T0219)L189 with cost 18.873
	T0219.try18-opt2.pdb.gz breaks before (T0219)S129 with cost 17.1498
	T0219.try18-opt2.pdb.gz breaks before (T0219)R39 with cost 13.5128
	T0219.try18-opt2.pdb.gz breaks before (T0219)A97 with cost 8.84064
	T0219.try18-opt2.pdb.gz breaks before (T0219)L291 with cost 8.04616
	T0219.try18-opt2.pdb.gz breaks before (T0219)R263 with cost 7.95241
	T0219.try18-opt2.pdb.gz breaks before (T0219)G195 with cost 7.02264
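
To compare the two break lists without sorting by eye, the report
lines can be parsed and intersected. This is only a sketch: it assumes
the line format shown above, and the names BREAK_RE and parse_breaks
are made up for illustration.

```python
import re

REPORT = """\
T0219.try19m1-opt2.pdb.gz breaks before (T0219)S129 with cost 18.2877
T0219.try19m1-opt2.pdb.gz breaks before (T0219)W244 with cost 17.7462
T0219.try19m1-opt2.pdb.gz breaks before (T0219)L189 with cost 13.5307
T0219.try19m1-opt2.pdb.gz breaks before (T0219)R253 with cost 8.50847
T0219.try19m1-opt2.pdb.gz breaks before (T0219)H176 with cost 8.25258
T0219.try19m1-opt2.pdb.gz breaks before (T0219)Y59 with cost 8.18638
T0219.try19m1-opt2.pdb.gz breaks before (T0219)A97 with cost 7.31736
T0219.try19m1-opt2.pdb.gz breaks before (T0219)V267 with cost 6.91126
T0219.try18-opt2.pdb.gz breaks before (T0219)S278 with cost 33.2098
T0219.try18-opt2.pdb.gz breaks before (T0219)K277 with cost 32.7518
T0219.try18-opt2.pdb.gz breaks before (T0219)D204 with cost 20.1485
T0219.try18-opt2.pdb.gz breaks before (T0219)L189 with cost 18.873
T0219.try18-opt2.pdb.gz breaks before (T0219)S129 with cost 17.1498
T0219.try18-opt2.pdb.gz breaks before (T0219)R39 with cost 13.5128
T0219.try18-opt2.pdb.gz breaks before (T0219)A97 with cost 8.84064
T0219.try18-opt2.pdb.gz breaks before (T0219)L291 with cost 8.04616
T0219.try18-opt2.pdb.gz breaks before (T0219)R263 with cost 7.95241
T0219.try18-opt2.pdb.gz breaks before (T0219)G195 with cost 7.02264
"""

BREAK_RE = re.compile(
    r"^\s*(\S+) breaks before \(T0219\)([A-Z]\d+) with cost ([\d.]+)")


def parse_breaks(report):
    """Return {model: {residue: cost}} for every 'breaks before' line."""
    breaks = {}
    for line in report.splitlines():
        m = BREAK_RE.match(line)
        if m:
            breaks.setdefault(m.group(1), {})[m.group(2)] = float(m.group(3))
    return breaks


breaks = parse_breaks(REPORT)
a = breaks["T0219.try19m1-opt2.pdb.gz"]
b = breaks["T0219.try18-opt2.pdb.gz"]
shared = sorted(set(a) & set(b), key=lambda r: int(r[1:]))
for residue in shared:
    print(f"{residue:5s} try19m1 {a[residue]:8.4f}   try18 {b[residue]:8.4f}")
```

Among the breaks listed, both models break before A97, S129 and L189.
The other large breaks are specific to one model or the other.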

Using the try19m1 costfcn, these are the best models:

T0219.try19m1-opt2.pdb.gz		190.9343
T0219.try19m1-opt1.pdb.gz		193.4771
T0219.try18m1-opt2.pdb.gz		195.0767
T0219.try19m1-opt2.repack-nonPC.	196.3990
T0219.try8-opt2.pdb.gz			198.3306
T0219.try18m1-opt2.repack-nonPC.	198.9315
T0219.try18m1-opt1.pdb.gz		199.6523
T0219.try8-opt1.pdb.gz			200.6519
T0219.try19m1-opt1-scwrl.pdb.gz		201.5967
T0219.try6-opt2.pdb.gz			201.7963
T0219.try8-opt2.repack-nonPC.pdb	202.2088
T0219.try8-opt1-scwrl.pdb.gz		203.2163
T0219.try17m1-opt2.pdb.gz		203.4750
T0219.try6-opt1.pdb.gz			205.2500
T0219.try18m1-opt1-scwrl.pdb.gz		205.3872
T0219.try6-opt2.repack-nonPC.pdb	205.5587
T0219.try18-opt2.pdb.gz			205.8308

And here are the ones that Rosetta likes best:

T0219.try1-opt2.repack-nonPC.pdb:totals      1789.1
T0219.try8-opt2.repack-nonPC.pdb:totals      2146.9
T0219.try17m1-opt2.repack-nonPC.pdb:totals   3165.3
T0219.try19m1-opt2.repack-nonPC.pdb:totals   3431.6
T0219.try17-opt2.repack-nonPC.pdb:totals     4169.5
T0219.try6-opt2.repack-nonPC.pdb:totals      4184.7
T0219.try18-opt2.repack-nonPC.pdb:totals     4813.3

Even though it scored well, try8 does not look very good to me,
with some extremely large holes. Let's submit the rosetta repack
version of try8 only. Also, there does not seem to be much point
in submitting the rosetta repack version of try1. So I would like
to submit these:

   T0219.try19m1-opt2.pdb.gz
   T0219.try18-opt2.pdb.gz
   T0219.try19m1-opt2.repack-nonPC.pdb
   T0219.try8-opt2.repack-nonPC.pdb
   T0219.try1-opt2.pdb.gz