Wed May 17 09:19:58 PDT 2006
T0291
Make started Wed May 17 16:20:47 PDT 2006
Running on shaw

Wed May 17 19:07:09 PDT 2006 Kevin Karplus

The t06 alignment has 226 pdb templates in it!

Wed May 17 19:35:40 PDT 2006 Kevin Karplus

RPSblast identifies the protein as tyrosine kinase:

    cd00192, 
	TyrKc, Tyrosine kinase, catalytic domain. Phosphotransferases;
	tyrosine-specific kinase subfamily. Enzymes with TyrKc domains
	belong to an extensive family of proteins which share a
	conserved catalytic core common to both serine/threonine and
	tyrosine protein kinases. Enzymatic activity of tyrosine
	protein kinases is controlled by phosphorylation of specific
	tyrosine residues in the activation segment of the catalytic
	domain or a C-terminal tyrosine (tail) residue with reversible
	conformational changes.

The best BLAST hits in pdb are
	1jpaB
	1mqbB
	1qcfA
	2ptk
	1y57A

1jpa[AB] is 79% identical and 90% positive with two 1-residue gaps over
234 residues.

1mqb[AB] is 72% identical and 84% positive with one 1-residue gap over 214 residues.

These two are much better than the third best (1qcfA: 42% identical and
62% positive with 5 gaps (up to length 4) over 119 residues).

1jpaA is in our template library, but 1mqbA is not.  Perhaps we need
to add it as MANUAL_TOP_HITS if it doesn't come out in the top few
hits by the automatic process.

Wed May 17 21:01:40 PDT 2006 Kevin Karplus

The t04 alignment has 54 pdb templates in it (fewer than t06, but
still a lot).


Make started Thu May 18 16:01:02 PDT 2006
Running on orcas.cse.ucsc.edu


I killed the jobs on shaw and restarted them on orcas, because shaw
seemed to be thrashing in muscle.  (I also removed the muscle
alignments from the pairwise alignment targets.)

Thu May 18 18:50:51 PDT 2006 Kevin Karplus

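The top blast hits are very strong (79%id).
(Standard BLAST tabular columns: query, template, %id, alignment length,
mismatches, gap openings, query start, query end, template start,
template end, E-value, bit score.)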
T0291	1jpaA	79.05	296	60	2	2	295	10	305	1e-139	490.7
T0291	1mqbA	72.05	297	82	1	10	305	29	325	6e-126	445.3
T0291	1qcfA	42.96	277	146	6	11	285	174	440	1.6e-60	228.0
T0291	2ptk	44.84	281	143	6	7	285	167	437	2.4e-59	224.2
T0291	1y57A	44.13	281	145	6	7	285	166	436	2.0e-58	221.1
T0291	1fmk	44.13	281	145	6	7	285	166	436	2.0e-58	221.1
T0291	1yolA	44.57	276	141	6	12	285	2	267	4.4e-58	219.9
T0291	1mp8A	39.93	268	158	2	21	288	11	275	7.6e-58	219.2
T0291	1yojA	44.00	275	144	5	12	285	2	267	9.9e-58	218.8

This ordering is somewhat different from the ordering by the HMMs,
though 1jpaA comes out high on that list also (4th).
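
A minimal sketch of how the hit table above could be read to pick out
the strong hits as MANUAL_TOP_HITS candidates.  It assumes the rows are
in the standard 12-column BLAST tabular layout; the Hit type, the
parse_hits function, and the 70% identity cutoff are hypothetical
choices for illustration, not part of the prediction pipeline.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    query: str
    template: str
    pct_id: float
    length: int
    evalue: float
    bits: float


def parse_hits(text):
    """Parse whitespace-separated 12-column BLAST tabular lines."""
    hits = []
    for line in text.splitlines():
        f = line.split()
        if len(f) != 12:
            continue  # skip blank or malformed lines
        hits.append(Hit(f[0], f[1], float(f[2]), int(f[3]),
                        float(f[10]), float(f[11])))
    return hits


# First three rows of the table above.
TABLE = """\
T0291 1jpaA 79.05 296 60 2 2 295 10 305 1e-139 490.7
T0291 1mqbA 72.05 297 82 1 10 305 29 325 6e-126 445.3
T0291 1qcfA 42.96 277 146 6 11 285 174 440 1.6e-60 228.0
"""

hits = parse_hits(TABLE)
# Assumed cutoff: keep only templates at or above 70% identity.
strong = [h.template for h in hits if h.pct_id >= 70.0]
print(strong)
```

With these three rows, only 1jpaA and 1mqbA pass the cutoff, which
matches the observation that they are much better than the rest.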

Thu May 18 19:24:51 PDT 2006 Kevin Karplus

There seems to be something wrong with undertaker---the
T0291.undertaker-align.pdb.gz file has only one model, though there
are several in the undertaker-align.under script.
This is the only target that has shown this behavior, though it is
reproducible on both cheep and lopez, even after recompiling part of undertaker.


Thu May 18 19:36:27 PDT 2006 Kevin Karplus

I found and fixed the bug in undertaker, and killed the try1 that was running.
While I'm at it, I might as well add the top BLAST hits as MANUAL_TOP_HITS.

Make started Thu May 18 19:43:45 PDT 2006
Running on lopez.cse.ucsc.edu

Rerunning the make (which should be mostly no-ops until the undertaker section).

Fri May 19 08:53:18 PDT 2006 Kevin Karplus

This looks like a 2-domain protein, with a domain break around N110.
A lot of the conserved residues are in the cleft between domains, so
accurate modeling may be difficult without the ligand, which was not
provided (and we couldn't do anything with anyway).

Some of the secondary-structure prediction looks a bit off, though the averaged
dssp-ehl2 predictions seem better than the str2 predictions.

Thu May 25 15:04:06 PDT 2006 Kevin Karplus

Have only done fully automatic prediction.  Need to do another run
with break and soft_clashes turned up.  Probably should just use the
MANUAL_TOP_HITS, maybe adding in the templates used in try1:
	2b7aA, 1xbbA, 1jpaA, 1qpcA
We should score the server models.

After that we should probably do a polishing run from all existing
models.

Sat Jun  3 08:43:57 PDT 2006 Kevin Karplus

Starting an optimization run (try2 on lopez) using top alignments and
a costfcn with fewer constraints (but with clashes and breaks turned
up).  Also scoring the server models with the try2 costfcn (if
undertaker doesn't crash on them).

Sat Jun  3 09:00:10 PDT 2006 Kevin Karplus

Partly, but not wholly, because the try2.costfcn uses the sheets from
try1-opt2 as constraints, try1-opt2 scores best with that cost fcn.
The second best is SAM_T06_server_TS1, which scores 11 points worse
(3.3 of which is from the constraints).
ROBETTA_TS3 is next with fewer clashes and breaks, but a cost of 15
more on constraints.  ROBETTA_TS4 scores very well on breaks and
clashes, but the constraints add 28 to the cost.

There are some servers that do ok on the constraints but score
terribly (like MetaTasser, which seems to have grossly overcompacted
the structure).

A lot of servers are scoring better after SCWRLing---though sometimes
with soft_clashes going up, so it may just be a relative weight of
sidechain energy and clash avoidance.  (Robetta scores get worse after
scwrling.)

Sat Jun  3 09:28:38 PDT 2006 Kevin Karplus

With the unconstrained costfcn, try1-opt2 still scores best and
SAM_T06_server_TS1 second, but third is now ROBETTA_TS4, followed by
ROBETTA_TS3. After various Robetta models, the scwrled RAPTOR_ACE
models come next.


Sat Jun  3 14:08:43 PDT 2006 Kevin Karplus

The try2 run scores slightly worse than try1 on the try2 cost
function, almost entirely due to constraints, though it also scores
worse on the unconstrained costfcn.

The main differences are in the N- and C-termini, which we are
unlikely to get right.  I think it is time for a polishing run, based
on the existing models. (Started as try3 on lopez)


Sat Jun  3 17:21:47 PDT 2006 Kevin Karplus

try3-opt2 scores best on the try3 and try2 costfcn, though try1-opt2 still
scores best on the try1 costfcn. Rosetta likes the repacked try3 the best,
though it still gives it positive energy.

Sat Jun  3 17:36:54 PDT 2006 Kevin Karplus

I'm starting one more polishing run, trying to close the remaining gaps.
(try4 on lopez).

Sun Jun  4 04:52:04 PDT 2006 Kevin Karplus

try4-opt2 now scores the best, and try4-opt2.repack-nonPC scores well
with rosetta.  I think we've reached the point of diminishing returns
here and might as well submit.

	try4-opt2
	try4-opt2.repack-nonPC
	try1-opt2
	undertaker-align 1	1p4oA
	undertaker-align 2	2evaA

Sun Jun  4 05:36:49 PDT 2006 Kevin Karplus

I ran into submission problems with the models from alignment, because
the duplicated atoms caused undertaker to remove the "wrong" duplicates.
# command:# ReadConformPDB reading from PDB file T0291.undertaker-align.pdb looking for model 1
WARNING: atoms too close: (T0291)D180.C and (T0291)P181.C only 0.0000000e+00 apart, marking (T0291)P181.C as missing
WARNING: atoms too close: (T0291)P181.N and (T0291)E182.N only 0.0000000e+00 apart, marking (T0291)E182.N as missing
WARNING: atoms too close: (T0291)P181.CA and (T0291)E182.CA only 0.0000000e+00 apart, marking (T0291)E182.CA as missing

removing P181 is ok, but removing E182 is wrong

# command:# ReadConformPDB reading from PDB file T0291.undertaker-align.pdb looking for model 2
WARNING: atoms too close: (T0291)L46.C and (T0291)K51.C only 0.0000000e+00 apart, marking (T0291)K51.C as missing
WARNING: atoms too close: (T0291)K47.N and (T0291)K52.N only 0.0000000e+00 apart, marking (T0291)K52.N as missing
WARNING: atoms too close: (T0291)K47.CA and (T0291)K52.CA only 0.0000000e+00 apart, marking (T0291)K52.CA as missing
WARNING: atoms too close: (T0291)K62.C and (T0291)Y65.C only 0.0000000e+00 apart, marking (T0291)Y65.C as missing
WARNING: atoms too close: (T0291)V63.N and (T0291)T66.N only 0.0000000e+00 apart, marking (T0291)T66.N as missing
WARNING: atoms too close: (T0291)V63.CA and (T0291)T66.CA only 0.0000000e+00 apart, marking (T0291)T66.CA as missing
WARNING: atoms too close: (T0291)S225.C and (T0291)Y226.C only 0.0000000e+00 apart, marking (T0291)Y226.C as missing
WARNING: atoms too close: (T0291)Y226.N and (T0291)G227.N only 0.0000000e+00 apart, marking (T0291)G227.N as missing
WARNING: atoms too close: (T0291)Y226.CA and (T0291)G227.CA only 0.0000000e+00 apart, marking (T0291)G227.CA as missing

removing K51 is ok, but removing K52 is wrong
removing Y65 is ok, but removing T66 is wrong
removing Y226 is ok, but removing G227 is wrong

Perhaps I should change the heuristic used for deciding which atoms to
mark as missing.  I could favor removing atoms from residues that are
already incomplete.
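
A minimal sketch of that heuristic, to make the idea concrete.  This
is not undertaker's code: choose_atom_to_drop and its data layout are
hypothetical, and it only tracks residues that lost an atom during the
current pass (a real version would presumably also need to know which
residues were already incomplete in the input file).

```python
def choose_atom_to_drop(first, second, incomplete):
    """Pick which of two coincident atoms to mark as missing.

    first, second: (residue, atom) tuples, e.g. ("P181", "N").
    incomplete: set of residues that have already lost an atom;
                updated in place.
    """
    if first[0] in incomplete and second[0] not in incomplete:
        # Prefer to damage a residue that is already incomplete.
        drop = first
    else:
        # Fallback (old behavior): drop the second atom of the pair.
        drop = second
    incomplete.add(drop[0])
    return drop


# The three coincident pairs reported for model 1 above.
pairs = [
    (("D180", "C"), ("P181", "C")),
    (("P181", "N"), ("E182", "N")),
    (("P181", "CA"), ("E182", "CA")),
]
incomplete = set()
dropped = [choose_atom_to_drop(a, b, incomplete) for a, b in pairs]
print(dropped)
```

On the model 1 pairs, all three dropped atoms come from P181 and E182
is left intact, which is the desired outcome.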

Sun Jun  4 05:51:38 PDT 2006 Kevin Karplus

OK, I changed undertaker:
# command:# ReadConformPDB reading from PDB file T0291.undertaker-align.pdb looking for model 1
WARNING: atoms too close: (T0291)D180.C and (T0291)P181.C only 0.0000000e+00 apart, marking (T0291)P181.C as missing
WARNING: atoms too close: (T0291)P181.N and (T0291)E182.N only 0.0000000e+00 apart, marking (T0291)P181.N as missing
WARNING: atoms too close: (T0291)P181.CA and (T0291)E182.CA only 0.0000000e+00 apart, marking (T0291)P181.CA as missing
# command:# ReadConformPDB reading from PDB file T0291.undertaker-align.pdb looking for model 2
WARNING: atoms too close: (T0291)L46.C and (T0291)K51.C only 0.0000000e+00 apart, marking (T0291)K51.C as missing
WARNING: atoms too close: (T0291)K47.N and (T0291)K52.N only 0.0000000e+00 apart, marking (T0291)K47.N as missing
WARNING: atoms too close: (T0291)K47.CA and (T0291)K52.CA only 0.0000000e+00 apart, marking (T0291)K47.CA as missing
WARNING: atoms too close: (T0291)K62.C and (T0291)Y65.C only 0.0000000e+00 apart, marking (T0291)Y65.C as missing
WARNING: atoms too close: (T0291)V63.N and (T0291)T66.N only 0.0000000e+00 apart, marking (T0291)V63.N as missing
WARNING: atoms too close: (T0291)V63.CA and (T0291)T66.CA only 0.0000000e+00 apart, marking (T0291)V63.CA as missing
WARNING: atoms too close: (T0291)S225.C and (T0291)Y226.C only 0.0000000e+00 apart, marking (T0291)Y226.C as missing
WARNING: atoms too close: (T0291)Y226.N and (T0291)G227.N only 0.0000000e+00 apart, marking (T0291)Y226.N as missing
WARNING: atoms too close: (T0291)Y226.CA and (T0291)G227.CA only 0.0000000e+00 apart, marking (T0291)Y226.CA as missing

I have resubmitted the models.
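
A quick way to check which residues lose atoms is to tally the
"marking ... as missing" clauses from the warnings.  The sketch below
assumes the warning format shown in the log above; the regular
expression and the missing_atoms helper are hypothetical and not part
of undertaker.

```python
import re
from collections import defaultdict

# Matches the trailing clause, e.g. "marking (T0291)P181.N as missing".
MARK_RE = re.compile(r"marking \((\w+)\)(\w+)\.(\w+) as missing")


def missing_atoms(log_text):
    """Map residue name -> list of atoms marked missing, in log order."""
    missing = defaultdict(list)
    for m in MARK_RE.finditer(log_text):
        residue, atom = m.group(2), m.group(3)
        missing[residue].append(atom)
    return dict(missing)


# Model 1 warnings after the change, copied from the log above.
LOG = """\
WARNING: atoms too close: (T0291)D180.C and (T0291)P181.C only 0.0000000e+00 apart, marking (T0291)P181.C as missing
WARNING: atoms too close: (T0291)P181.N and (T0291)E182.N only 0.0000000e+00 apart, marking (T0291)P181.N as missing
WARNING: atoms too close: (T0291)P181.CA and (T0291)E182.CA only 0.0000000e+00 apart, marking (T0291)P181.CA as missing
"""

summary = missing_atoms(LOG)
print(summary)
```

For model 1 this reports only P181, confirming that E182 is no longer
touched.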

Wed Jun 14 10:08:42 PDT 2006 Kevin Karplus

Solution released as 2gsfA.

Wed Jun 14 17:04:13 PDT 2006 Kevin Karplus

Our model1 does not do too badly on this one, though CIRCLE_TS1,
FAMS_TS5, and RAPTOR-ACE_TS1 beat it.  (That's with my evaluation,
which is weighted on contacts and rmsd---with GDT our model looks much
worse: 83.1% vs. Zhang-Server_TS1 at 91.7%.)  With all-atom RMSD, the
order is CIRCLE_TS1, FAMS_TS5, try3-opt1...model1.

How does Zhang-Server_TS1 get such a good GDT with such a crummy rmsd?
They really messed up the C-terminal end, but may have done better on
one of the internal loops.
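
A toy illustration of how a good GDT and a bad rmsd can coexist.  It
is a sketch under simplifying assumptions: real GDT_TS searches over
many superpositions and takes the best fraction at each cutoff,
whereas this uses one fixed list of per-residue CA deviations.  The
two deviation profiles are made up for illustration and are not
measurements from T0291.

```python
import math


def rmsd(dists):
    """Root-mean-square of per-residue deviations (Angstroms)."""
    return math.sqrt(sum(d * d for d in dists) / len(dists))


def gdt_ts(dists):
    """Mean percentage of residues within 1, 2, 4, and 8 Angstroms."""
    cutoffs = (1.0, 2.0, 4.0, 8.0)
    fracs = [sum(d <= c for d in dists) / len(dists) for c in cutoffs]
    return 100.0 * sum(fracs) / len(cutoffs)


# Hypothetical 100-residue models:
# A has a tight core but a badly misplaced 10-residue terminus;
# B is uniformly mediocre everywhere.
model_a = [0.8] * 90 + [25.0] * 10
model_b = [2.5] * 100

for name, d in (("A", model_a), ("B", model_b)):
    print(f"model {name}: GDT_TS={gdt_ts(d):.1f}  rmsd={rmsd(d):.2f}")
```

Model A wins on GDT because the misplaced residues are simply not
counted at any cutoff, while in rmsd their squared deviations dominate
the average.  A messed-up C-terminal end with an otherwise accurate
core would behave the same way.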

The most difficult loop turned out to be disordered.

Fri Jul 14 11:18:48 PDT 2006 Kevin Karplus

On the improved evaluation in evaluate.unconstrained.pretty, the best
model is now CIRCLE_TS1=FAMS_TS5 (-1.4504)
Of the TS1 models, SAM_T06_server is 20 out of 53 (-0.6726, though
	scwrling it would have helped a tiny bit -0.6731)
Our best model is try3-opt2.gromacs0 (-1.2665)
Our best submitted is model1 (-1.2431)

Our server did rather poorly on this target---why?