Mon May 15 09:07:37 PDT 2006
T0285
Make started Mon May 15 09:09:17 PDT 2006
Running on orcas.cse.ucsc.edu

Mon May 15 10:45:59 PDT 2006 Kevin Karplus

This looks like a new-fold (or remote homology) target.

Mon May 15 12:00:12 PDT 2006 Kevin Karplus

The evalue for the best hit is around 20 and the top hits are all to
different folds, so there really doesn't seem to be much chance of
getting a decent alignment.

Mon May 15 17:01:46 PDT 2006 Kevin Karplus

Due to a typo in the Make.main file, the rr constraints were not
computed before try1 was done, so the constraints were not used in try1.

Mon May 15 17:26:33 PDT 2006 Kevin Karplus

Although there were no strong hits (not surprising, since T0285 is an
ORFan), the try1-opt2 structure does not look bad.  Both secondary
structure and burial look pretty good.  There is really no point to
using the residue-residue predictions, since there is no mutual
information signal, and separation plus propensity is not very
interesting.

None of the top 5 alignments seemed to have anything interesting, so
it is pretty amazing that undertaker managed to come up with anything
reasonable.   I should probably do another run with a slightly tweaked
cost function, to see whether this fold is reliably generated.

I'll start this try2 run on orcas.

Mon May 15 20:11:44 PDT 2006 Kevin Karplus

I ran VAST on the try1-opt2 structure to see where the structure came
from.
There are matches to
1skoB	d.110.7.1
1f5mB	d.110.2.1
1vetA	d.110.7.1
1mc0A	d.110.2.1
1stzA	d.110.2.3
1ojgA	d.110.6.1
1j3wB	d.110.7.1
1vhmA	d.110.2.1

The 1f5mA hit is number 20 in T0285.best-scores.rdb---none of the higher hits
are annotated as being fold d.110.

I wonder if I should manually select some of the d.110 folds and do a
run using just them.

(Note: try2-opt2 is coming out fairly similar to try1-opt2, so I am
encouraged that this is the fold undertaker wants to make.)

Mon May 15 20:35:16 PDT 2006 Kevin Karplus

I looked for the d.110 folds found in any of the rdb files with
	grep 'd[.]110[.]' *.rdb | sort -g +2
I then defined MANUAL_TOP_HITS in the Makefile to be the list
  MANUAL_TOP_HITS := 1f5mA 1p0zA 1skoA 1j3wA 1vetA 1acf 1stzA 1mc0A
and ran
	make extra_alignments
	make read_alignments
(This has to be done in two separate makes, since some of the
directories needed for read_alignments don't exist until
extra_alignments has created them.)
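
The grep-and-sort step above can be sketched in Python, for anyone
without a sort that still accepts the old "+2" key syntax (it skips two
fields, so the key starts at the third field; the modern spelling is
-k3).  This is only a rough equivalent under assumptions: the column
layout of the rdb files is not described in this log, so the sketch
simply treats the third whitespace-separated field of each matching line
as the number to sort on, and the function names are made up for
illustration.

```python
import glob
import re

# Same pattern as the grep above: any SCOP identifier in fold d.110.
D110 = re.compile(r"d[.]110[.]")


def third_field(line):
    """Numeric value of the third whitespace-separated field.

    Lines with a missing or non-numeric third field sort first,
    which is roughly what 'sort -g' does with them.
    """
    fields = line.split()
    try:
        return float(fields[2])
    except (IndexError, ValueError):
        return float("-inf")


def d110_hits(lines):
    """Keep lines that mention fold d.110, sorted ascending by third field."""
    return sorted((ln for ln in lines if D110.search(ln)), key=third_field)


if __name__ == "__main__":
    # Prefix each line with its file name, as grep does for multiple files.
    lines = []
    for name in sorted(glob.glob("*.rdb")):
        with open(name) as f:
            lines.extend(f"{name}:{ln.rstrip()}" for ln in f)
    for ln in d110_hits(lines):
        print(ln)
```

The chain names for MANUAL_TOP_HITS still have to be picked by hand from
the printed lines; the sketch does not try to guess which column holds
them.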

I then created try3.under to read these alignments first (but I still
included all-align.a2m).

The try3.costfcn includes sheet and helix constraints from the try1-opt2
and try2-opt2 runs---this may be a mistake, since it won't allow other
alignments to be tried.  I may do another run without the constraints.

Tue May 16 07:28:25 PDT 2006 Kevin Karplus

The try3 run scores slightly better with the try3 costfcn, gaining on
hbonds and sidechains over try1, but losing on packing terms.

For try4, I'll use just the favored alignments and no constraints.
After that, I'll probably try polishing all existing models
(unconstrained).

It might be a good idea to reduce the weight on sidechain cost when
doing the initial packing--this term is currently the biggest
contributor to the differences in cost between models.
I did not reduce sidechain cost for try4.

Tue May 16 09:36:50 PDT 2006 Kevin Karplus

The try4 run did not do quite as well as the try3 run (which scores
best with the try1, try3, and try4 cost functions).

I'll do try5 with a reduced sidechain cost.
Then I'll do a polishing run with sidechain back up, and soft_clashes
and break costs increased.

Tue May 16 11:28:34 PDT 2006 Kevin Karplus

The try5 run found a different solution, one that changes the
c-terminal helix into a strand and makes two n-terminal helices.
This one scores well on hbonds and packing terms, but loses a bit on
predicted alpha, on bys, and on sidechains.

try5-opt2 scores best on the try1, try4, and try5 cost functions, but
not on try2 and try3, mainly because of the constraints.

Thu May 25 14:53:02 PDT 2006 Kevin Karplus


I'm worried a bit about the strand for the predicted helix R112-E122.
Perhaps we need another run with strong helix constraint.


Sun May 28 17:17:51 PDT 2006 Kevin Karplus

I scored the server models as well as ours with try5 (with the
recently fixed undertaker that handles pdb files with the ^M
characters at the end).

The best-scoring model is SAM_T06_server-TS1.
The best-scoring model that isn't ours is ROBETTA_TS5.
We do better on hbonds, they do better on sidechains, n_ca_c, and
bad_peptide.

SAM_T06_server-TS1 is a simple beta sheet with an alpha helix packed
on either side.  There is a bit of alpha helix near the N-terminus
that looks bogus (extend the first strand back to P8).  Otherwise the
model looks pretty good (if we ignore secondary structure predictions).
It probably scores so well because of how much sheet it makes, even
though the packing is a bit loose and the secondary structure is not a
great fit.

Our predictions (from the server and from the tries in decoys/) are
not in agreement, and they don't agree with Robetta either.  Should we
polish up some of the better models?  Which models will be worth
submitting later on?


Wed May 31 14:00:03 PDT 2006 Kevin Karplus

I'm starting a "polishing" run (try6 on camano) that will start with
all the different models (including server models) and try improving
them without constraints.  This will probably mainly polish the
SAM_T06_server model, but I've turned CrossOver up in the hopes of
picking up good things from elsewhere.

I think we will have 3 or 4 distinct sheets to submit for this target.

Wed May 31 14:32:15 PDT 2006 Kevin Karplus

The try6 polishing run is going very slowly, because of the number of
models in the conformation pool.  It seems like this run is just
polishing the SAM_T06_server model (which is ok, since we can then
submit it without duplicating a model from a server).

We might want to do a polishing run from ROBETTA_TS5 (without all the
other server models nor the better-scoring undertaker models), so that
we can reasonably submit that also. I'm starting that as try7 on camano.

Wed May 31 17:24:48 PDT 2006 Kevin Karplus

try7 finished before try6 (smaller conformation pools) and is our new
best-scoring model (though try6 may beat it in the end).

Foo---try7 is *not* a polishing of the Robetta models, since they
failed to be read.  It is just recreated from alignment insertion (of
1p0zA) into a random conformation.  It looks very similar to try5-opt2.

Let me try again for try8, getting the file names right this time.


Wed May 31 18:32:37 PDT 2006 Kevin Karplus

try8 scores almost as well as try7 (better on the try1 costfcn).

I'll submit 5 models for now, with the option of replacing them later:
ReadConformPDB T0285.try7-opt2.pdb
ReadConformPDB T0285.try8-opt2.pdb
ReadConformPDB servers/SAM_T06_server_TS1.pdb
ReadConformPDB T0285.try3-opt2.pdb
ReadConformPDB T0285.try2-opt1.pdb

Thu Jun  1 07:15:20 PDT 2006 Kevin Karplus

try6-opt2 (based on the SAM_T06_server model) is the new best scorer,
so I'll rearrange the models to


ReadConformPDB T0285.try6-opt2.pdb
ReadConformPDB T0285.try7-opt2.pdb
ReadConformPDB T0285.try8-opt2.pdb
ReadConformPDB T0285.try3-opt2.pdb
ReadConformPDB T0285.try1-opt2.pdb

Thu Jun  1 07:22:10 PDT 2006 Kevin Karplus

I've sent this improved list, but try3-opt2 and try1-opt2 are too
similar to each other. It would be good to get yet another fold.

It would also be good to do a polishing run with breaks and
soft-clashes turned up, as we still have a number of conflicts even in
the top-scoring models.

Thu Jun 22 14:44:53 PDT 2006 Kevin Karplus

try9 polishing run started on the farm cluster.  It will probably
just polish up try8-opt2.  We may have to do other runs to polish the
other models.

Thu Jun 22 16:59:25 PDT 2006 George Shackelford

As I take a look at this protein and examine the ehl2, I am convinced we
haven't got this right. There is an interesting symmetry in the ehl2
around the midpoint (ok, more like 78, but close). I am going to look
about and see if I can find a better match to the secondary structure.

Fri Jun 23 13:36:36 PDT 2006 Kevin Karplus

try9-opt2 is just a polishing of try8-opt2 (from ROBETTA_TS5), as
expected.

Fri Jun 23 22:37:51 PDT 2006 George Shackelford

I did a search for chains that match well to the best composite ehl2.rdb
file. I checked what came up and selected the following:
 1z54A 1e7kA 1s28A 1dbfA 1dytA 1xbiA
I include these in the manual top hits, and get alignments. I have put
them into try10 by themselves, scaled rr.constraints by .2, and started
the new try.

Just an effort to get something that agrees with ehl2.

try10 running on lopez

Sat Jun 24 01:53:30 PDT 2006

1z54A was the first and best choice for alignment. The results scored
even better than the polished try9. This one is worth polishing some
more and getting a better score; the 'breaks' in it could be improved.

I'm commenting 1z54A out and trying to see what I can get next.

try11 running on peep.

I ended up stopping try11 before it got to opt2. The results of opt1
were so poor (it was basically a mangled bunch of helices) that they
were scoring badly. These results were based on 1e7kA. So I commented it
out and went on to see what would happen next.

try12 running on peep.

Try12 actually looks decent but it has some bad breaks. It may be
useful to do a run that tries to close those gaps and see if we get
a usable result.

I think I'll start a "polishing" run...

Sat Jun 24 08:15:02 PDT 2006 Kevin Karplus

With the unconstrained costfcn, the order is now
	try6, try7, try9, try8, try10, try3, try5, try1
With the try1 costfcn, the order is now
	try9, try8, try6, try1, try3, try2, try7
With the try12 costfcn, the order is now
	try10, try9, try1, try3, try8, try2, try4
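
One crude way to summarize the three orderings above is a mean rank over
the models that appear in all three lists.  This is only an illustrative
sketch, not how the choice was actually made: the lists may be truncated,
and any model missing from even one list (try6 and try10, for example)
is dropped, which is a real limitation.  The function name is made up.

```python
# Orderings copied from the log entry above, best first.
orders = {
    "unconstrained": ["try6", "try7", "try9", "try8", "try10", "try3", "try5", "try1"],
    "try1": ["try9", "try8", "try6", "try1", "try3", "try2", "try7"],
    "try12": ["try10", "try9", "try1", "try3", "try8", "try2", "try4"],
}


def mean_ranks(orders):
    """Mean 1-based rank of each model present in every ordering.

    Returns (model, mean_rank) pairs, best first; ties are broken
    alphabetically by model name.
    """
    common = set.intersection(*(set(o) for o in orders.values()))
    means = {
        m: sum(o.index(m) + 1 for o in orders.values()) / len(orders)
        for m in common
    }
    return sorted(means.items(), key=lambda kv: (kv[1], kv[0]))


for model, rank in mean_ranks(orders):
    print(f"{model}\t{rank:.2f}")
```

A summary like this says nothing about whether two models are
near-duplicates, so it cannot replace superimposing the models and
looking at them.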

I'll superimpose some of these and see what looks promising.

Sat Jun 24 08:26:29 PDT 2006 Kevin Karplus

I looked at try6, try7, try10, try1, try3, try9

I don't care much for try6 and try7---too poor a match to secondary
structure prediction.  The others look ok, but try1 and try3 are very
similar.   That leaves us with
	try1, try10, try9
as our plausible submissions.

try6 and try7 are our next best, with try5 too similar to try7 to be a
separate submission.

Sat Jun 24 14:42:57 PDT 2006 Kevin Karplus

George has still got something wrong with his .cshrc file so that his
attempts to create grep-best-rosetta don't work.  I created it and see
that try4 and try10 are the models rosetta likes best, though it
really hates all the backbones.

Sat Jun 24 18:39:25 PDT 2006 Kevin Karplus

I'm starting try14 on cheep with all the alignments (including the
ones George added) and a cost function that has the secondary
structure constraints but no others.  I've also turned up the weight
of the hbond_geom_beta and hbond_geom_beta_pair, to favor alignments
that form sheets.

Sat Jun 24 20:26:52 PDT 2006 Kevin Karplus

try14-opt2 got the best score with the try14 cost function, apparently
based on 1z54A.  It does OK with the unconstrained costfcn, coming
after try6, try7, try9.

The resulting model is quite similar to the try10 model, but scores a
bit better with our cost functions.

Sun Jun 25 08:22:25 PDT 2006 Kevin Karplus

The other two alignments that try14 seriously considered were to 1v8fA
and 2cyeA, so I will try a run with just those two.
On second thought, let me toss in some other templates that were
considered but not used in other runs:

1tu1A, 1jyhA, 1fjrA from try1 (which ended up with 1f5mA)
1v8fA, 1iq3A from try2 (which ended up with 1f5mA)
try3, try4 only used 1f5mA
1stzA from try5 (which ended up with 1p0zA)
try6 worked from SAM_T06_server_TS1 model
try7 ended up with 1p0zA
try8 worked from ROBETTA_TS5
try9 worked from existing models (mainly try8)
try10 worked from 1z54A
try11 worked from 1e7kA
try12,try13 worked from 1dytA
1v8fA, 2cyeA from try14 (ended up with 1z54A)

So the "extras" to try are 1v8fA, 2cyeA, 1tu1A, 1jyhA, 1fjrA, 1iq3A, 1stzA

Sun Jun 25 08:36:22 PDT 2006 Kevin Karplus

try15 started on cheep.

Sun Jun 25 08:42:15 PDT 2006 Kevin Karplus

Oops, forgot to update MANUAL_TOP_HITS and to run make extra_alignments and make read_alignments.
try15 restarted on cheep.

Sun Jun 25 09:53:07 PDT 2006 Kevin Karplus

RATS, I let in the all-align for try15, so it ended up with 1z54A
again, but didn't even score as well as try14-opt2.

I'll start try16 without the all-align.

Sun Jun 25 09:58:45 PDT 2006 Kevin Karplus

try16 started on cheep.

Sun Jun 25 10:11:14 PDT 2006 Kevin Karplus

OK, try16 is using 2cyeA

Sun Jun 25 11:05:05 PDT 2006 Kevin Karplus

Indeed, try16 comes out second to try14 with the try16 costfcn.
Although they are based on different templates, they are clearly from
the same superfamily, and the alignments of the sheets are similar,
but not identical.

Since this target has a hard deadline tomorrow, and George does not
seem to be working on it this weekend, I'll have to make a decision
about what to submit.

I think I'll drop the one we polished from Robetta---it may be great,
but we didn't really create it.

That leaves me with
	try14-opt2	1z54A
	try16-opt2	2cyeA
	try1-opt2	1f5mA
	try6-opt2	SAM_T06_server_TS1
	try7-opt2	1p0zA

Sun Jun 25 11:23:48 PDT 2006 Kevin Karplus

OK, I'm giving up.  I've submitted those models.
If George (or anyone), does more work on this target, e-mail me so I
can resubmit.

Sat Jun 24 21:36:52 PDT 2006 George Shackelford

try14 looks great but it's wrong when you look at burial. I'm going to
force a run using 1s28A. Unlikely unless it forms a dimer.

Actually I'm doing a run with 1s28A and 1xbiA. At the moment, 1s28A is
coming out on top.

Try17 running on peep.

I did another search for possible ehl2 matches. These are possible, so
I'm doing a manual top hits and a run consisting of these only:
 1h1hA 1quqB 2fm8A 1xs0A
I've cranked up constraints to 20 to ensure that we get something that
we can like. I consider these to be a further hit than I got with
1z54A, but I wish I could factor in the near-backbone and/or burial
prediction as well.

Try18 running on shaw.

Sun Jun 25 16:22:03 PDT 2006 George Shackelford

Try17 didn't do so hot, but try18 looks like a winner. Scores up near
try16 and I haven't done a polishing run. Think I'll do that and
revisit 1xbiA. Might just do a run alone with that.

try19 running on shaw

Actually I am going to revisit 1xbiA. I'm going to do a run with it
alone. I have no idea if it will get anything new, just trying...

try20 running on peep

Sun Jun 25 18:34:31 PDT 2006

Damn. I didn't change the tries in the try20.under from try18. try20
has generated try18 files. That not only throws out the try18 results,
but may have badly affected the try19 polishing of try18!! Now I'll
need to redo try18, try19, and try20. At least I can run try18 and
try20 simultaneously. So be it!
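
A pre-flight check along the following lines might catch this kind of
copy-and-forget mistake before a run starts.  It is a minimal sketch
under assumptions: the layout of a .under script is not shown in this
log, so the check just scans the text for "tryN" names that differ from
the try being run.  Lines that legitimately read an earlier model as
input (for a polishing run, say) will be flagged too, so the output is a
list to review by eye, not a verdict.  The function name is made up.

```python
import re

# Any token of the form tryN, e.g. inside T0285.try18-opt2.pdb.
TRY_NAME = re.compile(r"try(\d+)")


def foreign_try_refs(script_text, expected_try):
    """Return (line_number, line) pairs mentioning a try other than expected_try."""
    hits = []
    for lineno, line in enumerate(script_text.splitlines(), start=1):
        numbers = {int(n) for n in TRY_NAME.findall(line)}
        if numbers - {expected_try}:
            hits.append((lineno, line.strip()))
    return hits
```

For example, calling foreign_try_refs(open("try20.under").read(), 20)
before starting the run would list every line that still says try18.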

Try20 restarted on peep; try18 restarted on shaw.

Sun Jun 25 22:09:18 PDT 2006 George Shackelford

Well, restarting did not seem to work correctly. The restart of try18
resulted in trash. I decided to redo try18 as try21 and try20 as
try22. I'll have to inform others that I had this problem. Kevin may
know what I should have done and others need to know as well.

Because of the restarts, I have not made any progress on finding other
ehl2 matches. I also need to change the transition value to log
odds. I also should get a second approach based on the actual sequence
by reducing the amino acid alphabet. But which reduction?? The eight
letter one? (I need a reference here).


Mon Jun 26 07:42:31 PDT 2006 Kevin Karplus

George sent me e-mail to look at try19.

He has never explained (in README files or elsewhere) *how* he is
doing searches based just on ehl information.  I can think of a few
ways, but I suspect he is doing something completely different.
George, write up a couple of paragraphs on it for the main CASP7/README

I'm not sure which reduction of the amino-acid alphabet George is
thinking of, nor why he believes it will help---the Dirichlet-mixture
regularizer already does a good job of generalizing to other amino
acids, better than any reduced alphabet would.

With the try16 costfcn (with constraints from the 2ry prediction)
try19-opt2 scores adequately, after try14, try16, try15, try10.
It forms a nice sheet, but the helix packing against the sheet is terrible.
Still, we do have a near duplicate in our current set of models, so I
can bump one of them to make room for try19.

George has still not fixed his .cshrc file so that he can run the
sort-by-rosetta script from a makefile, so I'll have to remake
grep-best-rosetta.  Naturally, rosetta hates all the models, but
try16-opt2 is the one it hates least to repack.


I'll resubmit
ReadConformPDB T0285.try16-opt2.pdb
ReadConformPDB T0285.try1-opt2.pdb
ReadConformPDB T0285.try6-opt2.pdb	# from SAM_T06_server
ReadConformPDB T0285.try7-opt2.pdb
ReadConformPDB T0285.try19-opt2.pdb

Make started Tue Jul 18 09:50:03 PDT 2006
Running on lopez.cse.ucsc.edu

Because of a typo, I accidentally redid the make for this target.
This created a number of local structure predictions that were not
available at the time we did the submission.   Because we had no
strong hits before, this may change which templates come out on top.

Tue Jul 18 22:41:56 PDT 2006 Kevin Karplus

NOTE: superimpose-best.under was overwritten by the accidental make.
It will have to be re-created if best-models.pdb.gz ever needs to
reflect the submitted models.