Thu May 18 09:57:00 PDT 2006
T0293
Make started Thu May 18 09:59:20 PDT 2006
Running on shaw

Thu May 18 14:52:13 PDT 2006 Kevin Karplus

The t2k alignment finds 3 PDB files:
	1nv8A
	1sg9A
	1vq1A
(all hemK)

The t04 alignment finds 10 PDB files.
The t06 alignment finds 44 PDB files.

There are at least 100 good templates in the superfamily.
It looks like the family is c.66.1.30

The top templates look like
	1t43A
	2b3tA
	1sg9A
	1vq1A
	1nv8A

Make started Thu May 18 16:01:18 PDT 2006
Running on orcas.cse.ucsc.edu

I killed the jobs on shaw and restarted them on orcas, because shaw
seemed to be thrashing in muscle.  (I also removed the muscle
alignments from the pairwise alignment targets.)

Thu May 18 20:00:10 PDT 2006 Kevin Karplus

This looks like fold recognition with easy-to-find templates, as BLAST
just misses getting significant hits:
T0293	1vq1A	30.00	80	50	3	67	146	139	212	0.371	30.42
T0293	1sg9A	30.00	80	50	3	67	146	127	200	0.371	30.42
T0293	1nv8A	30.00	80	50	3	67	146	129	202	0.371	30.42
T0293	1s6yA	22.33	103	73	2	48	143	223	325	  7.0	26.18
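
A minimal sketch of how the hits above could be screened, assuming the
lines are standard 12-column BLAST tabular output (query, subject,
percent identity, alignment length, mismatches, gap opens, query
start/end, subject start/end, e-value, bit score).  The field names,
function names, and the 0.01 cutoff are illustrative choices, not part
of the pipeline used for this target.

```python
# Sketch only: parse the four BLAST lines quoted above.
# ASSUMPTION: the columns are the standard 12-column tabular layout.
FIELDS = ["query", "subject", "pct_id", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]
INT_FIELDS = ("length", "mismatch", "gapopen",
              "qstart", "qend", "sstart", "send")
FLOAT_FIELDS = ("pct_id", "evalue", "bitscore")

BLAST_TEXT = """\
T0293 1vq1A 30.00 80 50 3 67 146 139 212 0.371 30.42
T0293 1sg9A 30.00 80 50 3 67 146 127 200 0.371 30.42
T0293 1nv8A 30.00 80 50 3 67 146 129 202 0.371 30.42
T0293 1s6yA 22.33 103 73 2 48 143 223 325 7.0 26.18
"""

def parse_blast(text):
    """Return one dict per well-formed line; other lines are skipped."""
    hits = []
    for line in text.splitlines():
        parts = line.split()          # works for tabs or spaces
        if len(parts) != len(FIELDS):
            continue
        hit = dict(zip(FIELDS, parts))
        for key in INT_FIELDS:
            hit[key] = int(hit[key])
        for key in FLOAT_FIELDS:
            hit[key] = float(hit[key])
        hits.append(hit)
    return hits

def significant(hits, max_evalue=0.01):
    """Keep hits at or below an (illustrative) e-value cutoff."""
    return [h for h in hits if h["evalue"] <= max_evalue]

hits = parse_blast(BLAST_TEXT)
print(len(hits), "hits,", len(significant(hits)), "below the cutoff")
```

With any conventional cutoff none of the four hits passes, which is
the sense in which BLAST "just misses" here.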

Make started Fri May 19 09:03:20 PDT 2006
Running on lopez.cse.ucsc.edu

The undertaker run seems to have died (or been killed) without an
error message in the try1.log file, so I'm running the make again.

Fri May 19 15:43:00 PDT 2006 Kevin Karplus

The try1 model seems to be coming from 1ej0A, though there may be
assembly of pieces from several templates:
	1nv8A, 1sqgA, 2as0A, 1h1dA, 1ws6A, 1ej0A
The e-value seems to keep getting worse as later alignments are chosen.

This looks like a case where the good alignments were damaged by undertaker.

The top templates for T0293.best-scores.rdb overlap at least partially
with the top blast hits.
1nv8A	284	5.3000e-31	8.0162e-14		c.66.1.30	86224
1dusA	195	6.8200e-21	1.4625e-13	1dusA	c.66.1.4	34182
1jg1A	216	4.0500e-19	1.3858e-10		c.66.1.7	66661
1o54A	266	7.3000e-21	1.5390e-10		c.66.1.13	92482
1dl5A	318	5.2500e-20	2.4539e-10	1dl5A	c.66.1.7,d.197.1.1	64720,64721
1f3lA	322	5.4900e-20	3.0416e-10	1g6q2	c.66.1.6	59630
1im8A	226	2.1200e-20	1.2693e-09		c.66.1.14	66212
1i9gA	265	7.3300e-19	2.2275e-09		c.66.1.13	62090

Perhaps we should limit the set of alignments to choose from---perhaps
only the top 10 or 20 hits with the HMMs?


Thu May 25 15:08:11 PDT 2006 Kevin Karplus

Need to look at templates and try to figure out what to do on this one.

T0293

Do a run with just the top few templates and blast hits.

Sun Jun  4 12:19:44 PDT 2006 Kevin Karplus

There seems to be a large N-terminal region that is not matched by the templates.
We might want to do two overlapping subdomains V1-E49 and D28-D250.

I started both of these running on lopez.

Sun Jun  4 18:23:43 PDT 2006 Kevin Karplus

OOPS.  I forgot to check which split-into-domains was being
called---this Makefile still had the old one, so the try1 runs were
being done with the old methods from pcpe/starter-directory.
I'll fix the Makefiles and rerun make (after moving try1 into a new directory).

The V1-E49 run had completed, so I looked at it.  It is predicting an
alpha-beta domain instead of the all-alpha stuff we saw before.  I'm
not sure I believe it though.

Sun Jun  4 21:41:02 PDT 2006 Kevin Karplus

I reran V1-E49, and the new prediction (also an alpha-beta domain)
looks a bit more convincing---at least it ends with a helix, so may be
able to link up with the second domain.

Mon Jun  5 16:25:14 PDT 2006 Kevin Karplus

The D28-D250 part was rerun also.
George just fixed the handling of residue-residue predictions for
chains that don't start with residue 1, so I remade the rr predictions
for D28-D250.  This did not change the results of constraints, but did
eliminate the error messages.

Mon Jun  5 17:08:51 PDT 2006 Kevin Karplus

I tried making a chimera of try1-opt2 from V1-E49 and D28-D250,
crossing over at P39.  This chimera1 has terrible clashes, but may be
worth trying to optimize.
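
A minimal sketch of the crossover idea, assuming each model is reduced
to a dict from residue number to a CA coordinate.  This is not how
undertaker builds chimeras; the function names and the toy coordinates
are made up for illustration.  The splice does no superposition, so if
the two models are in different frames the junction will show a break
(consecutive CA atoms should be about 3.8 Angstroms apart).

```python
import math

def splice(nterm_model, cterm_model, crossover):
    """Residues below `crossover` come from nterm_model, the rest from cterm_model."""
    if crossover - 1 not in nterm_model or crossover not in cterm_model:
        raise ValueError("crossover must lie in the overlap of the two models")
    chimera = {i: xyz for i, xyz in nterm_model.items() if i < crossover}
    chimera.update({i: xyz for i, xyz in cterm_model.items() if i >= crossover})
    return chimera

def junction_gap(chimera, crossover):
    """CA-CA distance across the crossover; far from 3.8 means a break."""
    return math.dist(chimera[crossover - 1], chimera[crossover])

# Toy data (NOT real coordinates): two straight chains with 3.8 A spacing,
# the second one offset by 5 A to mimic models in different frames.
nterm = {i: (3.8 * i, 0.0, 0.0) for i in range(1, 50)}     # V1-E49
cterm = {i: (3.8 * i, 5.0, 0.0) for i in range(28, 251)}   # D28-D250

chimera = splice(nterm, cterm, 39)                          # cross over at P39
print(len(chimera), "residues, junction gap",
      round(junction_gap(chimera, 39), 2), "A")
```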

I'll try this as try2 on camano.

Tue Jun  6 12:16:24 PDT 2006 Kevin Karplus

try2-opt2 has done a decent job of fixing up the chimera.
Clashes and breaks are definitely reduced relative to try1-opt2.
Constraints are not well met, probably because they are inconsistent.

try2-opt2 also scores best with the unconstrained costfcn.

For try3, I should probably polish either without constraints, or with
constraints taken from try2-opt2.

(Hmm, trying to score all the server models crashed undertaker again,
this time on CaspIta-FOX_TS1.)

Tue Jun  6 13:20:10 PDT 2006 Kevin Karplus

After commenting out the CaspIta-FOX models, I could score all the
servers with the unconstrained costfcn.  try2-opt2 still scores best,
and SAM_T06_server is the top server model, followed by ROBETTA_TS5,
Pmodeller6_TS1, ROBETTA_TS4, ...

Tue Jun  6 13:41:22 PDT 2006 Kevin Karplus

Looking at the superposition of many models, including ours and some
of the server models, I'm not so pleased with try2-opt2.  It has moved
too far from the models from alignments.  (It is a little hard to
tell, since the superposition is poor---it is getting trapped by the
differences in the N-terminal residues.)  Wait, that's not the
problem---I was starting with a superposition on P39.  I've changed to
using a pretty much unchanged portion (G65-G85) to initialize the superposition.

Tue Jun  6 13:54:52 PDT 2006 Kevin Karplus

Now the models superimpose well, but the two problem areas are the
N-terminal region (for which the SAM_T06_server model looks most
promising) and N143-G181.  The loops produced by the servers in this
region look pretty trashy.


Wed Jun  7 17:27:05 PDT 2006 Kevin Karplus

Looking at the superposition of several models, I see some problems
with some of the models around residue E185.  try1-opt2, try2-opt2,
D28-D250/try1-opt2, and the third alignment from undertaker-align.pdb
(to 1jg1A) seem to be misaligned.

I should probably remove any sheet constraints on this region that
come from 1jg1A and reoptimize.

Incidentally C140, C142, and C206 all cluster, so may form a
metal-binding site, though there is no corresponding site in the 
templates and none of these is highly conserved (C140 is somewhat
conserved in t2k, but IVTL are all more common).

Wed Jun  7 18:32:11 PDT 2006 Kevin Karplus

I've created a try3.costfcn and will do a try3 run from the alignments
(including the subdomain alignments) on shaw.

If it does a decent job, I'll submit try3-opt2 and try2-opt2 tomorrow
morning, do some polishing and have the final submission ready by
Monday, when it is due.

Wed Jun  7 22:08:25 PDT 2006 Kevin Karplus

try3-opt2 scores worse than try1-opt2 and try2-opt2.

The problem is that the hairpin at T230-W246 has gotten detached from
the sheet.  It seems to be attached properly in SAM_T06_server_TS1, so
I'll pick up sheet constraints from there for try4.

Thu Jun  8 09:20:47 PDT 2006 Kevin Karplus

try4-opt2 is better, but still needs some work.
In particular, I suspect that the predicted N-terminal strands may be
part of the big sheet, perhaps adjacent to the C-terminal strands.

Still, there is a soft deadline this morning, so I'll make a submission.


Thu Jun  8 17:37:34 PDT 2006 Kevin Karplus

Firas used ProteinShop to fix up the N-terminus where I thought it
ought to go (decoys/fromTry4.proto.pdb).

He and I then made up sheet constraints (partly borrowed from
servers/SAM_T06_server_TS1) to fix up the bulgy edge strand and
connect to the newly added piece.  We are trying to polish it up as
try5 on the farm cluster.

If it doesn't clean up the bulgy sheet, we may want to do a
cut-and-paste from servers/SAM_T06_server_TS1 to make a chimera to optimize.

Fri Jun  9 08:48:05 PDT 2006 Kevin Karplus

try5-opt2 did not clean up the bulgy strand.
Looking at ROBETTA_TS5 and the second undertaker-align model, I see
that try5-opt2 may have the strand upside down.  I'll try cutting and
pasting in a strand from the second  undertaker-align model.

I made a chimera of try5-opt2 and the second model in
undertaker-align, copying in V226-C235.  The breaks are bad, but they
should be fixable, especially if I get the sheet constraints right.

Oops, that chimera has a bulgy strand, which will be difficult to line
up with.  Let's try making a chimera with the SAM_T06_server_TS1
model, copying V229-Q236.

Now we need to get the sheet constraints right.

From SAM_T06_server_TS1.sheets:
SheetConstraint (T0293)W203 (T0293)G209	(T0293)S247 (T0293)R241	hbond (T0293)Y204	1
SheetConstraint (T0293)V229 (T0293)C235	(T0293)W246 (T0293)M240	hbond (T0293)T230	10

Added by hand
SheetConstraint  T230	F234	E32	P36		hbond Y231		30
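
A minimal sketch for reading the three constraint lines above.  It
ASSUMES the fields are: start and end of strand 1, start and end of
strand 2, the keyword hbond, one residue of strand 1, and a weight.
That reading is inferred from these lines, not taken from undertaker
documentation, and the function names are made up.  Under that
assumption the direction of the two residue ranges tells whether a
pairing is parallel or antiparallel.

```python
import re

# Optional "(T0293)"-style prefix, one-letter residue type, residue number.
RESIDUE = re.compile(r"(?:\([^)]*\))?([A-Z])(\d+)$")

LINES = [
    "SheetConstraint (T0293)W203 (T0293)G209 (T0293)S247 (T0293)R241 hbond (T0293)Y204 1",
    "SheetConstraint (T0293)V229 (T0293)C235 (T0293)W246 (T0293)M240 hbond (T0293)T230 10",
    "SheetConstraint  T230 F234 E32 P36  hbond Y231  30",
]

def residue_number(token):
    m = RESIDUE.match(token)
    if m is None:
        raise ValueError(f"not a residue token: {token!r}")
    return int(m.group(2))

def parse_sheet_constraint(line):
    tokens = line.split()
    if len(tokens) != 8 or tokens[0] != "SheetConstraint" or tokens[5] != "hbond":
        raise ValueError(f"unexpected layout: {line!r}")
    s1, e1, s2, e2 = (residue_number(t) for t in tokens[1:5])
    # Both ranges running the same way is read here as a parallel pairing.
    same_direction = (e1 - s1) * (e2 - s2) > 0
    return {
        "strand1": (s1, e1),
        "strand2": (s2, e2),
        "hbond": residue_number(tokens[6]),
        "weight": float(tokens[7]),        # ASSUMED meaning of the last field
        "orientation": "parallel" if same_direction else "antiparallel",
    }

for line in LINES:
    c = parse_sheet_constraint(line)
    print(c["strand1"], c["strand2"], c["orientation"], c["weight"])
```

Read this way, the two server-derived constraints are antiparallel
and the hand-added T230-F234 / E32-P36 pairing is parallel.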


Fri Jun  9 11:32:53 PDT 2006 Kevin Karplus

try6-opt1 gets a new best score on the try6 costfcn, so it seems that
we can get a decent model out of the chimera.  There are still some
problems with very bad breaks before R238, G237, and V226, which do
not seem to be closing in the opt2 part of the run.

Other little problems: K59 seems to be in the way of V1 and F5 packing
in closely.  L34 may need to be closer to L51.  The buried residues
around G69, I75, V93, F147, F186 seem a bit too exposed---as if the
protein had flexed open a bit.

Also the inserted loop from K154 to G181 is probably all junk.

Fri Jun  9 12:43:43 PDT 2006 Kevin Karplus

Sure enough, try6-opt2 did no further reduction of the breaks.
Even more annoyingly, it did not get the  sheet constraints for
V229-Q236 fully satisfied.

I'll try another run with breaks turned up and the unsatisfied
constraints strengthened.

Interestingly, the try6-opt2.repack-nonPC model has significantly
poorer satisfaction of the constraints, despite an identical backbone.
I think that undertaker may have picked some rather "off" rotamers
that put CB in funny places to try to satisfy the constraints.

I'll try the next optimization from the rosetta-repacked model, to
avoid the bad CB positioning.

Fri Jun  9 17:10:54 PDT 2006 Kevin Karplus

try7-opt2 has closed the worst breaks---the following are the worst
ones left:
	T0293.try7-opt2.pdb.gz breaks before (T0293)E21 with cost 1.43015
	T0293.try7-opt2.pdb.gz breaks before (T0293)G209 with cost 1.07871
	T0293.try7-opt2.pdb.gz breaks before (T0293)T232 with cost 0.798814
	T0293.try7-opt2.pdb.gz breaks before (T0293)R241 with cost 0.583598

There are still some problems getting the beta sheet to be well
formed, because the strand containing T232 is offset a bit from where
it should be relative to L244.  Sliding the sheet containing the N-terminus and the
strand with T232 over about 1 residue so the H-bonds line up right
would probably fix the problem.

Perhaps Firas could do that with ProteinShop?
Moving the first helix to reduce the exposed hydrophobics might also
be easier with ProteinShop.

In the meantime, I'll try really upping the Hbond constraints for the
hbonds I want, to see if undertaker will do it if forced.
(Running as try8 on cheep.)

Fri Jun  9 20:45:56 PDT 2006 Kevin Karplus

try8 closed gaps a little more, but did not improve the constraints much.

Sun Jun 11 15:09:10 PDT 2006 Firas Khatib

I tried moving the T232 sheet over and saved it 4 times along the way to do
minimal changes (and therefore minimal damage):
shift232over1FromTry8.renum.pdb 	= try9  (running on orcas)
shift232over1.1FromTry8.renum.pdb 	= try10 (running on orcas)
shift232over1.2FromTry8.renum.pdb 	= try11 (running on whidbey)
shift232over1.2andMoveH33-37.renum.pdb 	= try12 (running on whidbey)

It would be nice if we could get Proteinshop to draw hydrogen bonds since with
Proteinshop you can see the hydrogen bond sites and hydrogen cages!

For now, I moved it in baby steps and will run 4 tries to see if any of them help.

I will move the first helix afterwards.

I did not want to move the sheet containing the N-terminus if I don't have to 
because it has 6 hydrogen bonds already and I don't want to break those!

Sun Jun 11 17:41:10 PDT 2006 Firas Khatib

I used Proteinshop while the 4 tries were running to modify the T232 sheet into a 
full sheet (since you can change residue ss assignments in Proteinshop) and I was 
able to make 6 hydrogen bonds between the T232 strand and the L244 strand!
I lost a few hbonds with the 33-37 strand but will try to fix that next.

This new Proteinshop attempt is: shift232over2FromTry8.renum.pdb 

I am running this as try13 on shaw.

Sun Jun 11 20:20:06 PDT 2006 Kevin Karplus

All the new runs seem to be making good progress, but it is harder to
say which one will end up the best.  Currently, try8-opt2 still scores
best, but try11-opt1 and try10-opt1 are doing well enough that they
may be able to compete.

I'll probably want to do a polishing run from all models (at least all
the new models) when the current runs have finished.

Sun Jun 11 21:55:13 PDT 2006 Kevin Karplus

Unfortunately, try13 has "try9" inside it everywhere, so whatever it
did was either stepped on by try9 or is now called try9.
Sigh, and that was supposed to be the most promising one.

I will do the global replace in try13.under and run it again, perhaps
with slightly shorter optimizations, so that there will be time for a
polishing run.

Actually, I don't think it will fit the costfcn very well, since that
is requesting that T230 be aligned with A245, but Firas has slid it
the other way to align with S247.
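
The two registers can be compared with a little arithmetic.  In an
idealized antiparallel ladder (no bulges) the residue numbers of each
pair sum to a constant, so one known pair fixes all the others.  The
sketch below uses V229-W246 from the SAM_T06_server_TS1 constraint for
the costfcn register and T230-S247 for the shifted one; the function
name is made up, and the no-bulge assumption is a simplification for a
strand described elsewhere in these notes as bulgy.

```python
def antiparallel_partner(residue, anchor_a, anchor_b):
    """Partner of `residue` when anchor_a pairs with anchor_b.

    ASSUMES a regular antiparallel ladder, where paired residue
    numbers sum to a constant (anchor_a + anchor_b).
    """
    return anchor_a + anchor_b - residue

REGISTERS = {
    "costfcn (V229-W246)": (229, 246),
    "shifted (T230-S247)": (230, 247),
}

for name, (a, b) in REGISTERS.items():
    pairs = {r: antiparallel_partner(r, a, b) for r in range(229, 236)}
    print(name, pairs)
```

The costfcn register puts T230 opposite residue 245 (A245) and C235
opposite 240 (M240), matching the server constraint; the shifted
register moves every partner two residues toward the C-terminus.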

I suppose that we could try that alignment also, though I don't like
how it sticks V229 out.
I made a try14-costfcn that has the alignment Firas was trying to
make, and it scores try9-opt2 best (which may actually have come from
the mislabeled try13 run).  

I will do another optimization run with try14 (though I don't really
believe in it), and submit try8 and the best model that comes out with
Firas's alignment.

Sun Jun 11 22:20:58 PDT 2006 Firas Khatib

wow... it seems I have totally botched the last hopes for this target.

There wasn't enough slack on the 225-228 end of the strand, which is why
I shifted the strand the other way, since there was a lot more slack from
234-241, but I see what you mean about the V229.

Sun Jun 11 23:33:54 PDT 2006 Firas Khatib

I tried to line up T230 and A245 as well as rotate the N-terminal helix to 
reduce the exposed hydrophobics.

decoys/lineup230and245take1.2rotateNterminusAndMove.renum.pdb

I also tried to slide the 33-37 helix back (since I moved the 230-245 one)
but this was more difficult:

decoys/lineup230and245take1.9rotateNtermSlide33-37helix.renum.pdb

I will run this one as try15, in case it is better.

and I will run lineup230and245take1.2rotateNterminusAndMove.renum.pdb as try16
which I hope will be the best!

Mon Jun 12 00:46:28 PDT 2006 Kevin Karplus

try14-opt2 is based on try9-opt2,  but does not really satisfy the
sheet constraints for either the try14 or the try9-13,try15-16 cost
functions. I should probably try the optimization with this alignment
over again with just the 
ReadConformPDB shift232over2FromTry8.renum.pdb
starting point, since optimizing from all was not very successful in
making a clean sheet. (running this as try17 on cheep)

So in the morning, I'll have to choose between whatever scores best
with try16.costfcn and whatever scores best with try17.costfcn,
perhaps doing a final polishing pass on that.  I should probably
submit as model 2 whatever scores best with the other costfcn, as I'm
not that certain of the correct alignment.


Mon Jun 12 06:08:13 PDT 2006 Kevin Karplus

I don't like the way try15-opt2 has swung out the initial helix, so
discard that one.

try17-opt2 does a better job of covering predicted buried residues
than try14-opt2, so I prefer it, even though try14-opt2 has slightly
better H-bonds.  Actually, the difference is *which* hbonds are
present for strand T230-G237, since neither model manages to get the
H-bonds on both sides.

I will submit
ReadConformPDB T0293.try17-opt2.pdb
ReadConformPDB T0293.try16-opt2.pdb
ReadConformPDB T0293.try8-opt2.pdb
ReadConformPDB T0293.try4-opt2.pdb

ReadConformPDB T0293.undertaker-align.pdb model 1	(from 1nv8A)

Mon Jun 12 06:43:34 PDT 2006 Kevin Karplus

submitted.

Wed Jun 14 10:09:41 PDT 2006 Kevin Karplus

Solution released as 2h00A.

Wed Jun 14 14:34:42 PDT 2006 Kevin Karplus

Foo! we did *not* do well on this one, AND the server did better than
we did by hand.  Our best model (try15-opt2) is not one we submitted,
because I didn't like it. The model from SAM_T06_server_TS1 was better
than any of our hand submissions.

So much for my understanding of protein structure.

The N-terminus was not at all like what we got from handling it as a
subdomain. The big insertion that we never touched (between F151 and
G177) was disordered anyway, so it was just as well we didn't fuss
with the model there.

The best server model was Zhang-Server_TS1, but we would have done
well to copy the best-scoring  server with the unconstrained costfcn
(other than ours), ROBETTA_TS5, which scored 6th among the servers
with the evaluation function I'm using and best with GDT.

Using just GDT, SAM-T02_AL5 is the best of our servers, but is still
not very good.


Fri Jul 14 11:38:38 PDT 2006 Kevin Karplus

Using the improved evaluation in evaluate.unconstrained.pretty, the
SAM_T06 server is 30th of 53 TS1 models from servers---pretty feeble!
(real-cost 0.24, while ROBETTA_TS5 is -0.24, and Zhang-Server_TS1 is -0.21)

Our best model was try15-opt2 (0.23) and our best submitted model was
model4 (0.28).  We did worse by hand than the median server!

The sheet is more twisted and curled than we made it.