Mon May 22 09:36:51 PDT 2006
T0295
Make started Mon May 22 09:37:39 PDT 2006
Running on lopez.cse.ucsc.edu

Mon May 22 09:42:45 PDT 2006 Kevin Karplus

The submitters mention that t0295 has a dimer in the unit cell.
BLAST finds very high similarity to 1zq9A (47%id over 276 residues)
for an E-value of 4.7e-69.

Mon May 22 10:29:24 PDT 2006 Kevin Karplus

The t06 multiple alignment finds 27 PDB sequences in the multiple
alignment, and the t04 alignment finds 21.

Mon May 22 11:21:44 PDT 2006 Kevin Karplus

1zq9A is coming out on top in the t04 scorings (it isn't in the t06
template library, at least not yet).  It looks like there are at least
74 templates in the c.66.1 SCOP superfamily, so there should be plenty
of variety available for modeling any variable loops.


Mon May 22 14:17:52 PDT 2006 Kevin Karplus

As expected, 1zq9A scores best in T0295.best-scores.rdb.
Not far behind is 1qyrA, then a big jump to 1qamA and others.

The conservation in the multiple alignments is focussed mainly on the
first 100 residues, particularly in the t06 alignment. The t2k
alignment also shows considerable conservation for F158, P161, and
P163, but I don't know whether the difference is in what sequences are
aligned or the quality of the alignment.   Most likely the number of
sequences, as t06 has 4174 sequences and t2k has only 3372.

The target protein is "dimethyladenosine transferase, putative
[Plasmodium falciparum 3D7]", 1zq9A is "probable dimethyladenosine
transferase" from Homo sapiens, and 1qyrA is E coli's "High level
kasugamycin resistance protein KsgA", which is also in the SCOP family
"rRNA adenine dimethylase-like".  This appears to be a fairly ancient
fold, appearing in both bacteria and eukaryotes (which is not surprising
for something associated with the ribosome).

Mon May 22 15:53:38 PDT 2006 Kevin Karplus

The alignments to templates are in excellent agreement for the
N-terminal sheet, but the C-terminal helices seem to be a bit
scattered. I hope that the top template really indicates where they go!

Mon May 22 20:13:16 PDT 2006 Kevin Karplus

This is clearly a 2-domain protein, with a domain break somewhere
around S177-T181.    I will do a subdomain prediction for S177-F275.

It looks like T0295.try1-opt1 has a bad misalignment of one strand:
S136 to C146 should probably be antiparallel to S167-P174 in some
alignment rather than being wound into a helix, though none of the
alignments in undertaker-align.sheets has such a pairing.

Looking at the models from alignments, only model 1 (from 1zq9A) has
the strand wound into a helix.  Maybe I should not believe so strongly
in 1zq9A, and see what happens if I take out the sheet constraints
from that alignment. (Or maybe I should believe 1zq9A and ignore the
secondary-structure prediction.)

Mon May 22 20:48:35 PDT 2006 Kevin Karplus

The S177-F275 region is clearly based on 1zq9A---nothing else comes close.
This region is 43.6% identical over 94 residues (BLAST e-value 1.2e-16).

Perhaps I should chop off the C-terminal domain and see whether that
causes the first domain to find a different template.  I'll start a
prediction for H1-T181.

It looks like C113, C88, and C146 may be coordinating a metal ion, if
C146 is in the right place. But C88 and C113 are on opposite sides of
the sheet, suggesting a misalignment by 1 of some strand (either
C113-Q119 or V87-N91).  Of course, none of these CYS residues are
conserved, so there may be nothing here of interest.

There is a strand misalignment in try1-opt2 (based on burial patterns).
I think that L172 should be antiparallel to V115, not A114.  When
looking at the space-fill structure, I'm much less convinced of this,
as V115 is covered by a helix, but K171, which anti-parallels it, is
very exposed.  Perhaps the current alignment is the best one.

According to Blast, 1zq9A is the best match for the first domain also,
with 50% identity over 176 residues.  The next best is 1qyrA at only
29% identity.

Thu May 25 15:08:39 PDT 2006 Kevin Karplus

The S177-F275 try1-opt2 prediction looks good, as does the H1-T181.
We should do a superposition with the main try1-opt2 and see if we can
make a chimera and optimize as a whole.

Thu May 25 16:41:36 PDT 2006 Kevin Karplus

Other than at the very edges of the domains, the subdomain predictions
superpose nicely with the full try1-opt2.  If we want to make a
chimera, we can take
	H1-L172		from domain1
	I173-D184	from whole chain
	E185-F275	from domain2
and then reoptimize with break weights turned up to clean up the joins.
It's not clear that the effort would be worth it.
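
A minimal sketch of the splice described above, assuming the three
models have already been superposed into a common frame and that each
is available as a map from residue number to coordinates.  The names
CHIMERA_PLAN and build_chimera are hypothetical, not undertaker
commands; the reoptimization of the joins is not shown.

```python
# Residue ranges from the chimera plan: (first, last, source label).
CHIMERA_PLAN = [
    (1, 172, "domain1"),    # H1-L172 from the domain1 prediction
    (173, 184, "whole"),    # I173-D184 from the whole-chain model
    (185, 275, "domain2"),  # E185-F275 from the domain2 prediction
]


def build_chimera(models, plan=CHIMERA_PLAN):
    """Assemble {residue_number: coordinates} by taking each range from its source.

    models: dict mapping a source label to {residue_number: coordinates}.
    Assumes all models share one coordinate frame (already superposed).
    """
    chimera = {}
    for first, last, source in plan:
        for resnum in range(first, last + 1):
            chimera[resnum] = models[source][resnum]
    return chimera
```

The joins at L172/I173 and D184/E185 are where breaks would be
expected, which is why the break weights would need to be turned up
afterwards.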

Sat May 27 18:51:00 PDT 2006 Kevin Karplus

I picked up the server predictions, and SAM_T06_server_TS1 scores best
(even better than try1-opt2) with the try1 costfcn.  Several other
servers do fairly well (BayesHH_TS1-scwrl, GeneSilicoMetaServer_TS2,
RAPTOR-ACE_TS4-scwrl, ...).

Maybe I should do a polishing run that takes in all the server models
(and current models) and tries polishing them up.

Sat May 27 19:07:48 PDT 2006 Kevin Karplus

I'm trying a polishing run as try2, but I'm not sure what undertaker
will do with incomplete conformations---it may cause a crash in the
optimization.

Sat May 27 19:41:29 PDT 2006 Kevin Karplus

undertaker seems to be coping ok with the missing residues, but there
were some PDB files not read, probably because of Windows trash (^M at
the ends of lines) that was not properly handled by the crude PDB parser
that we borrowed from UCSF.

In any case, the try2 run is progressing, very occasionally making a
tiny improvement to SAM_T06_server_TS1.


Sun May 28 09:43:20 PDT 2006 Kevin Karplus

Both undertaker and rosetta like try2-opt2 better than try1-opt2.

Sun May 28 09:50:55 PDT 2006 Kevin Karplus

I just noticed that try2-opt2 puts the second domain
in a *very* different place than try1-opt2 and the alignments.  

While I'm willing to have one structure that moved the domain like
this, I think we need to do an optimization with some constraints to
maintain the packing of the domains.

Possible constraints (taken from try1-opt2):

Constraint	V130.CA	N254.CA	5 5.8 7
Constraint	L144.CA	E257.CA	7 7.8 9
Constraint	L144.CA	F183.CA	9 9.9 12
Constraint	I140.CA	E257.CA	7 7.8 9
Constraint	I140.CA L190.CA 8 8.7 11
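
A minimal sketch of how lines in this format could be parsed and
checked against a model.  It assumes that the three numbers are the
minimum, desired, and maximum distance in Angstroms; that reading is an
assumption made here, and the names Constraint, parse_constraint, and
violation are hypothetical (this is not undertaker code).

```python
from typing import NamedTuple


class Constraint(NamedTuple):
    atom1: str    # e.g. "V130.CA"
    atom2: str    # e.g. "N254.CA"
    lo: float     # assumed: minimum allowed distance
    want: float   # assumed: desired distance
    hi: float     # assumed: maximum allowed distance


def parse_constraint(line: str) -> Constraint:
    """Parse one whitespace-separated 'Constraint a b lo want hi' line."""
    fields = line.split()
    if len(fields) != 6 or fields[0] != "Constraint":
        raise ValueError(f"not a constraint line: {line!r}")
    lo, want, hi = (float(x) for x in fields[3:])
    return Constraint(fields[1], fields[2], lo, want, hi)


def violation(c: Constraint, distance: float) -> float:
    """Return how far a distance falls outside [lo, hi]; 0.0 if inside."""
    if distance < c.lo:
        return c.lo - distance
    if distance > c.hi:
        return distance - c.hi
    return 0.0
```

Splitting on any whitespace copes with the mix of tabs and spaces in
the lines above.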


Sun May 28 10:56:49 PDT 2006 Kevin Karplus

Even with these constraints added, SAM_T06_server_TS1 scores best, so
the problem arose in the try2 optimization, not in copying from the
server model.

Perhaps I should pick up constraints from that model instead of try1-opt2.

Constraint	Y135.CA	I253.CA	6 6.8 8
Constraint	R137.CA	R191.CA	6 6.5 8
Constraint	I140.CA	E257.CA	6 7.1 8
Constraint	I140.CA	L190.CA	8 8.6 10
Constraint	N141.CA	D187.CA	7 7.5 9
Constraint	L144.CA	E257.CA	7 7.7 9
Constraint	L144.CA	F183.CA	8 9.5 11
Constraint	F145.CA	S178.CA	5 5.6 7

I put these into try4.costfcn and started a try4 run.  For the try4
run, I selected the models we generated, plus the top 200 server
models (according to the try4 scoring).  Actually, since the top
models were a mix of plain models and scwrled models, I ended up requesting
112 distinct server models and 112 scwrled models from them, in
addition to the 10 models from decoys.

Some of the requests failed.  For example, all the ones that claim to
be from RAPTORESS are not, probably because of ^M at the end of the
lines for RAPTORESS, which breaks the PDB reading in undertaker.

Sun May 28 12:03:37 PDT 2006 Kevin Karplus

I tried fixing libpdb (used by undertaker) to remove the extraneous ^M
characters and will rerun the scoring of the servers with
try4.costfcn.  This will probably show that I picked a few of the
wrong servers for the long try4 run, but I doubt that it will make any
difference, since the top server results will be the same.  The
RAPTORESS read failures were hiding the real RAPTOR-ACE_TS5-scwrl
scores (because of misnaming), but RAPTOR-ACE_TS5 was included in the
try4 run anyway.

Sun May 28 12:18:18 PDT 2006 Kevin Karplus

Foo! the fix to libpdb did not work.  Either I patched libpdb wrong or
I misdiagnosed the problem.  I'll have to try again.

Sun May 28 12:29:35 PDT 2006 Kevin Karplus

The fix was done only to pdb_read_record, but also needed to be done
to pdb_gzread_record.  It is now ok.
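
The lesson here (two readers, one of them left unpatched) can be
illustrated with a small sketch in which every line source goes through
the same cleaning step.  This is Python for illustration only, not the
libpdb C code, and the function names are hypothetical.

```python
def clean_pdb_record(raw: str) -> str:
    """Strip trailing newline and carriage-return characters (DOS line endings)."""
    return raw.rstrip("\r\n")


def read_records(lines):
    """Yield cleaned, non-empty records from any iterable of text lines.

    Plain and gzipped inputs can both be passed through this one function,
    so the ^M handling lives in a single place.
    """
    for raw in lines:
        record = clean_pdb_record(raw)
        if record:
            yield record
```

Any iterable of lines can be fed to read_records, whether it comes from
a plain file or from a decompressed one.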

Sun May 28 13:12:27 PDT 2006 Kevin Karplus

The RAPTORESS models (which were not read successfully for the try4
run) are not bad, but there are several better server models, so there
is not much lost by omitting them.

Sun May 28 13:59:07 PDT 2006 Kevin Karplus

The best scoring of the server models (other than ours) with the try4
costfcn is BayesHH_TS1-scwrl.  Interestingly, the unscwrled model
scores much worse.  Clashes go up slightly with the scwrling and one
or two non-backbone Hbonds are lost, but the sidechain cost improves
greatly with scwrl.

Sun May 28 15:14:37 PDT 2006 Kevin Karplus

Looking at T0295.try4-opt1, it occurs to me that we might want to pack
I192.CG2 and CD1 closer to V210.CG1 and CG2.

Currently we have

Distance VAL210A.CG1-ILE192A.CG2: 4.298
Distance VAL210A.CG1-ILE192A.CD1: 4.316
Distance VAL210A.CG1-ILE192A.CG1: 4.985

Distance VAL210A.CG2-ILE192A.CG2: 3.957
Distance VAL210A.CG2-ILE192A.CG1: 4.575
Distance VAL210A.CG2-ILE192A.CD1: 4.651

and we could reduce them to around 3.3 (except the largest one, which
we should probably not try to constrain).

Sun May 28 23:04:12 PDT 2006 Kevin Karplus

try4-opt2 is the best-scoring so far, both with try4.costfcn and try5.costfcn.
grep-best-rosetta likes it best also.  Note: there was no try3 run, as
that costfcn was rejected in favor of try4.costfcn.

Mon May 29 08:41:28 PDT 2006 Kevin Karplus

try5 died with a segmentation fault (not even an assertion failure!)

I suspect that the problem is the number of conformations that are
missing a lot of atoms---the optimization routines are not set up to
handle incomplete conformations.  Hmm---that doesn't seem to be right,
as the OptConform command only adds conformations that are complete
(at least if use_all is set).

Mon May 29 11:35:45 PDT 2006 Kevin Karplus

The bug seems to have been one I introduced last night---I deleted a
Segment that was being replaced in find_breaks, but forgot to include
a check to make sure that there was something to replace it with
before doing the deletion.


Mon May 29 12:32:30 PDT 2006 Kevin Karplus

try5 seems to be running ok, and it looks like it will succeed in
closing all the gaps, which would be nice.

Mon May 29 14:11:27 PDT 2006 Kevin Karplus

Looking at try5-opt1, I see that there *is* still a break between S136
and R137, even though undertaker has lost sight of it.  I wonder how
that happened, as the CA-CA distance of 4.473 is much larger than the
3.8024 that is the ideal CA-CA distance.
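
An independent check for breaks of this kind can be sketched as
follows, using only the CA coordinates in chain order.  The 3.8024
value is the ideal distance quoted above; the tolerance of 0.3 Angstrom
and the function name are assumptions made for this sketch, not
undertaker parameters.

```python
import math

IDEAL_CA_CA = 3.8024  # ideal CA-CA distance in Angstroms, as quoted in these notes


def ca_ca_breaks(ca_coords, tolerance=0.3):
    """Return (index, distance) for consecutive CA pairs that are too far apart.

    ca_coords: list of (x, y, z) tuples, one per residue, in chain order.
    tolerance: hypothetical slack in Angstroms added to the ideal distance.
    The index is the position of the first residue of the stretched pair.
    """
    breaks = []
    for i in range(len(ca_coords) - 1):
        d = math.dist(ca_coords[i], ca_coords[i + 1])
        if d > IDEAL_CA_CA + tolerance:
            breaks.append((i, d))
    return breaks
```

With this tolerance, a CA-CA distance of 4.473 is reported as a break.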

Mon May 29 15:57:22 PDT 2006 Kevin Karplus

At least the clashes seem to be getting reduced in try5, so the
results may be ok even if there is an error in the reporting of breaks.

Mon May 29 17:04:25 PDT 2006 Kevin Karplus

The breaks in try5-opt2 seem to be identical to those in try4-opt2.
Why did undertaker lose track of them??

The number of clashes was reduced and the number of H-bonds went up,
but the worst clash is the same:
other-bump: 1.68792 Ang (T0295)K84.CD and (T0295)R108.O threshold= 2.72575 cost= 0.943612

With the unconstrained.costfcn, try2 and try5 score almost the same.
Rosetta likes repacking try5-opt2 better.

Tue May 30 16:59:18 PDT 2006 Kevin Karplus

I think I've fixed undertaker, so I'll do another polishing run
(including the same server models as in try5, plus all the decoys
models) to see if I can reduce the breaks for real.  I've decreased
the constraint weight and increased the soft_clashes and breaks for
try6. 

Tue May 30 19:56:36 PDT 2006 Kevin Karplus

Although try6-opt1 scores slightly better than try5-opt2, it really
hasn't reduced the breaks and clashes---the tiny changes are from
sidechain repacking.  We seem to be trapped in a local minimum (albeit
a pretty good one).

Mon Jun  5 14:07:31 PDT 2006 Kevin Karplus

try6-opt2 slightly increased breaks and slightly decreased clashes
relative to try6-opt1.  Rosetta likes repacking try5-opt2 better than try6-opt2.

Mon Jun  5 14:26:41 PDT 2006 Kevin Karplus

The biggest difference between try1-opt2, try6-opt2,
SAM_T06_server_TS1, and the first undertaker alignment (to 1zq9A)
is in the hinging between the two domains. try6-opt2 has the most
"closed" of the four hinge positions, though they are all quite similar.

I think that we can do one polishing run and submit.
For try7 I increased soft_clashes and breaks (and, slightly,
pred_alpha2k, pred_alpha04, pred_alpha06).
I also decreased phobic_fit, since it favors the incorrect domain
orientation of try2-opt2.

Starting polishing run as try7 on orcas.

Mon Jun  5 16:13:04 PDT 2006 Kevin Karplus

try7-opt2 made tiny improvements in breaks and clashes, but rosetta
still prefers repacking try5-opt2.

I think I'll submit
ReadConformPDB T0295.try7-opt2.pdb
ReadConformPDB T0295.try5-opt2.repack-nonPC.pdb
ReadConformPDB T0295.try1-opt2.pdb

ReadConformPDB T0295.undertaker-align.pdb model 1
ReadConformPDB T0295.undertaker-align.pdb model 2

Mon Jun  5 16:21:51 PDT 2006 Kevin Karplus

Models submitted.

Wed Jun 14 10:12:10 PDT 2006 Kevin Karplus

solution released as 2h1rA.

Wed Jun 14 16:27:30 PDT 2006 Kevin Karplus

Our best submitted model is model3 with a GDT of 70.6% and our best
created model was try4-opt1-scwrl with 72.5%, but *lots* of servers did better.
3Dpro_TS2 had GDT of 82.15% !  Our server is right in the middle of
the servers.  The best TS1 model was HHpred2_TS1, with a GDT of 77.9%
(FUNCTION_TS1 had slightly higher GDT, but worse on other functions,
like RMSD).
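
For reference, a rough sketch of what a GDT number summarizes: the
average, over cutoffs of 1, 2, 4, and 8 Angstroms, of the percentage of
target residues whose CA atom lies within the cutoff of its position in
the experimental structure.  The real measure searches over many
superpositions for each cutoff; this sketch assumes the per-residue
deviations from a single superposition are already in hand, so it is an
approximation and not the evaluation used for the numbers above.  The
function name is hypothetical.

```python
def gdt_ts(distances, n_target=None, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Approximate GDT_TS (in percent) from per-residue CA deviations.

    distances: deviations in Angstroms for the residues present in the model,
               measured after one fixed superposition (an assumption here).
    n_target:  number of residues in the target; defaults to len(distances).
    """
    n = n_target if n_target is not None else len(distances)
    if n == 0:
        raise ValueError("no residues to score")
    fractions = [sum(d <= c for d in distances) / n for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)
```

Passing n_target lets an incomplete model be scored against the full
target length, so that missing residues count against it.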

I made a "best-post.pdb" file that has our models, then the best server
models.

The differences seem to be almost entirely hinging between the
domains, which I suspect is luck as much as anything else.


Fri Jul 14 11:47:06 PDT 2006 Kevin Karplus

With the improved evaluation in evaluate.unconstrained.pretty, our
best submitted model is still model3 (-0.90), but our best model is
try4-opt1-scwrl (-0.93).  SAM_T06 is 33rd of 54 TS1 models---pretty
feeble (-0.84).  The best server model was 3Dpro_TS2 (-1.22) though it
could have been improved slightly by scwrling.


Thu Sep 14 12:26:32 PDT 2006 Kevin Karplus

With the latest revisions to the real_cost evaluation, model3 is our
best submitted (-161.79), try4-opt1-scwrl is our best generated
(-168.16), and SAM_T06_server is -148.36  (19th of 54).