Mon Jun 26 09:45:53 PDT 2006
T0344
Make started Mon Jun 26 09:47:37 PDT 2006
Running on lopez.cse.ucsc.edu

Mon Jun 26 10:04:34 PDT 2006 Kevin Karplus

BLAST gets no strong hits in PDB.  Best is 1mpyA (26% over 81
residues, E-value 0.6).

Worse than that---this sequence is basically an ORFan, with only the
orthologs in other Pyrococcus species found in NR.  Oops, not quite,
there is also a Thermococcus kodakarensis copy of the protein.
Still, the diversity of sequences is very small, so we don't have much
evolutionary signal to use.

Mon Jun 26 22:02:25 PDT 2006 Kevin Karplus

The HMMs liked 1z47A best (E-value 0.8) and 1l2tA next (E-value 0.8).

try1-opt2 seems to be based on 1z47A, but is absolutely *terrible*.
The hairpin from G197-P213 (probably the most useful scrap of the
alignment) was discarded.

The alignment to 1l2tA looks a bit more promising.

This one is going to take a lot of work to convince the scraps to
assemble into a fold.


Wed Jul 12 14:39:25 PDT 2006 Chris Wong

I've noticed this protein has 4 tryptophan residues in the multiple
alignments.

I am setting up a try2 using T0344.t06 str2 sequence logo to identify
where I want some helix and sheet constraints.

( make -k T0344.do2 > & do2.log ; gzip -9f do2.log ) & started on
lopez 3pm.

Wed Jul 12 15:07:25 PDT 2006 Chris Wong

killed job due to errors in sheet constraints.  going to fix them now.

try2 restarted at 313pm.  oops, lopez is busy... looking for a free
processor.  Okay, running on peep @ 323pm.


Fri Jul 14 12:26:42 PDT 2006 Chris Wong

try2 does not look "protein-like".  It is poorly packed, with things
sticking out everywhere.  I think I'm going to try to get a little
more of the secondary structure formed before I focus on packing.
try3 adds some more helix and sheet constraint.  I used t06 str2 and
dssp-ehl2 to come up with those.  try3 started on orcas at about
1230p.


Fri Jul 14 15:44:13 PDT 2006 Chris Wong

try3 is missing the following helix: HelixConstraint R55 K71.  So, I'm
going to run a try4 with increased weight on the constraints to make
sure those helices form.

( make -k T0344.do4 > & do4.log ; gzip -9f do4.log ) &

started at 5p on orcas.

Sun Jul 16 07:25:14 PDT 2006 Kevin Karplus

Soft deadline this morning, and not much has been done.
No comments to tell me what to submit.

unconstrained costfcn prefers try3,try2,try1,try4
rosetta prefers try2, try1, try3, try4
try1 costfcn prefers try3,try1,try4, try2

try4 looks terrible.
try3 looks like it is trying to form something.

alignments 2 (to 1l2tA) and 5 (to 1fobA) may have found some useful
super-secondary structure.

I'll do a preliminary submission of
	try3
	try1
	try2
	align2 1l2tA
	align5 1fobA

Sun Jul 16 07:40:50 PDT 2006 Kevin Karplus

Preliminary submission done, but this target needs a lot of work.

There's not much conservation signal---we have an archaeal ORFan here.
That means all predictions (local structure, residue-residue, and fold
recognition) will be poor.

There are no cysteines, so "maybe_metal" and "maybe_ssbond" can be
discarded.

Look at the server models, and perhaps extract some sheet constraints
for parts that look good.  All the best models seem to be coming from
robetta, so I fetched the 10 robetta models. (They look better than
any of *our* models.)

Extracting sheet constraints from robetta models might be useful.
I put together superimpose-robetta.under and extracted sheet and helix
constraints for the 10 robetta models.  There should probably be
optimization runs done with each set of constraints (perhaps
discarding constraints that look rather implausible).

If we can't assemble anything from just the constraints, we could try
polishing the robetta models.  I don't have time to work on this, but
I hope that Zack and Chris do.


Mon Jul 17 15:33:59 PDT 2006 Chris Wong

We will do some polishing runs on each set of Robetta constraints
(helices and sheets).

try5 -> robetta1 started on camano at 430p.
try6 -> robetta2 started on vashon at 445p.
try7 -> robetta3 started on shaw at 452p.
try8 -> robetta4 started on orcas at 519p.
try9 -> robetta5 started on orcas at 522p.
try10 -> robetta6 started on shaw at 523p.
try11 -> robetta7 started on vashon at 525p. Restarted on iris at 555p.
try12 -> robetta8 started on neigh at 539p.
try13 -> robetta9 started on orcas at 923p.
try14 -> robetta10 started on orcas at 923p.


Tue Jul 18 10:56:59 PDT 2006 Chris Wong

I've just taken all optimized robetta models from last night and
superimposed them along with the original robetta models.  The script
for this was superimpose-robetta.under.  The resulting file is
opt-robetta-models.pdb where the first 10 are the original robetta
models and the last 10 are the corresponding optimized robetta models.
I should note that try8-opt2 and try12-opt2 were the top scoring
optimized models (from score_all.try14.pretty).

I'm going to send a message to Prof. K to see what he thinks.

Tue Jul 18 11:21:16 PDT 2006 Kevin Karplus

None of the optimizations moved the robetta models much--they are
still essentially the same.  Several of them have reasonable sheet
fragments.  I don't have time to go through each one and select out
good ones.  They are all over the map in terms of overall fold.

It doesn't make any sense to use the try14 costfcn to distinguish
among them, since it has helix and sheet constraints from robetta10.
Look at the score-all.unconstrained.pretty instead, or grep-best-rosetta.

Perhaps we should do a run with all the sheet constraints, to see if
we can get any of the sheets to form.

We might also want to redo the make, so as to get new o_notor2,
n_notor2, o_sep, and n_sep predictions.  They won't be great for an
orfan, but they might provide some suggested H-bonds that will help.

Make started Tue Jul 18 11:27:11 PDT 2006
Running on orcas.cse.ucsc.edu

Tue Jul 18 11:33:14 PDT 2006 Kevin Karplus

I'll try putting together a try15.costfcn, using all the sheet
constraints from the (optimized) robetta models.

Tue Jul 18 11:39:10 PDT 2006 Kevin Karplus

There are a few sheet constraints that don't work
# StrandConstraint Warning: can't construct Strand constraints for
(T0344)E87 and (T0344)L88 as there are no CA atoms with usable spacing
# StrandConstraint Warning: can't construct Strand constraints for
(T0344)I183 and (T0344)I184 as there are no CA atoms with usable spacing
# SheetConstraint # Warning: strands too close on chain:
(T0344)L166--(T0344)I167 (T0344)E175--(T0344)V176
but for the most part undertaker accepted the constraints.

The TIM-barrel-wannabe in try9-opt2 scores best with the try15
constraints, despite having the unusually low-separation parallel
strands (T0344)L166--(T0344)I167 (T0344)E175--(T0344)V176

Let's try a run from the alignments with the try15 costfcn, and see if
we can come up with anything more reasonable.


Make started Tue Jul 18 12:01:01 PDT 2006
Running on orcas.cse.ucsc.edu

I restarted the make, after discovering that the notor and sep
alphabets weren't being included in the best-scores and alignment
computations.

Make started Tue Jul 18 12:04:46 PDT 2006
Running on orcas.cse.ucsc.edu

(And again---found another mistake in Make.main)

Tue Jul 18 15:53:12 PDT 2006 Kevin Karplus

Remake *finally* done.

Started try15 on cheep.

This may take *much* too long, as there are now 3282 alignments in
all-align.a2m

Tue Jul 18 18:10:35 PDT 2006 Kevin Karplus

Actually, try15 did not take very long.

It produced a model from alignment that seems to come ultimately from
1z47A (though it is probably piecing together several alignments).
Well---all it gets is junk.  Maybe one strand-helix-strand that is ok
for V139-E175.

We could try doing an optimization run for each set of sheet
constraints separately, but I don't have time to set up such a set of
runs.

We should probably also look at the o_sep and n_sep predictions.
For example, there is a strong prediction of
	Hbond K182.N	K179.O
	Hbond K182.O	K179.N

	Hbond I184.N	N177.O
	Hbond I184.O	N177.N
That is,
	SheetConstraint N177 K179	I184 K182	Hbond N177

There is also a crisp helix finish with
	Hbond I129.N	W124.O
and maybe
	Hbond I101.N	I96.O

There may also be a turn:
	Hbond G194.O	D189.N
	Hbond E192.N	D189.O


There is also a hint that V201.O may hbond with K208.N or W209.N
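
The step from the four predicted Hbonds (K182/K179 and I184/N177) to the
single SheetConstraint line in this entry can be sketched in a few lines
of Python.  This is only an illustration, not part of undertaker: the
function names are hypothetical, the output simply mimics the constraint
line written above, and it assumes an antiparallel pair in which the
second strand's residue numbers decrease as the first strand's increase.

```python
import re


def resnum(label):
    """Return the residue number from a label such as 'K182'."""
    m = re.fullmatch(r"[A-Z](\d+)", label)
    if m is None:
        raise ValueError(f"bad residue label: {label!r}")
    return int(m.group(1))


def hairpin_sheet_constraint(hbonds):
    """Build a SheetConstraint line from predicted backbone Hbonds.

    hbonds is a list of (atom, atom) pairs such as ('K182.N', 'K179.O').
    Assumes every pair joins the same two antiparallel strands.
    """
    pairs = set()
    for atom_a, atom_b in hbonds:
        res_a = atom_a.split(".")[0]
        res_b = atom_b.split(".")[0]
        # Store each residue pair once, lower residue number first.
        low, high = sorted((res_a, res_b), key=resnum)
        pairs.add((low, high))
    ordered = sorted(pairs, key=lambda p: resnum(p[0]))
    first_low, first_high = ordered[0]    # outermost pair of the hairpin
    last_low, last_high = ordered[-1]     # innermost pair of the hairpin
    return (f"SheetConstraint {first_low} {last_low}\t"
            f"{first_high} {last_high}\tHbond {first_low}")


predicted = [
    ("K182.N", "K179.O"),
    ("K182.O", "K179.N"),
    ("I184.N", "N177.O"),
    ("I184.O", "N177.N"),
]
print(hairpin_sheet_constraint(predicted))
```

The helix-finish and turn Hbonds listed after the sheet are not handled
by this sketch; they would need their own constraint lines.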


Wed Jul 26 12:09:14 PDT 2006 Zack Sanborn

I will start a set of optimization runs for each set of sheet constraints.
Actually, just the top5 model sheet constraints so we don't tie up too
many computers.  I will strengthen the 'constraints' cost in these cost
functions to really try to get Undertaker to force these constraints on
the structures.

The top 5 Robetta-constraints will be from the following models:

	T0344.try8-opt2  < robetta4
	T0344.try10-opt2 < robetta6
	T0344.try12-opt2 < robetta8
	T0344.try9-opt2  < robetta5
	T0344.try5-opt2  < robetta1

We will optimize try15-opt2 with the sheet constraints from each of these
models.

The five trys (try16 - try20) have been set up and are ready to go.

try16 (optimizing try15-opt2 with try8-opt2 constraints) was started on
orcas at 12:30.

try17 (optimizing try15-opt2 with try10-opt2 constraints) was started on
lopez at 12:31.

try18 (optimizing try15-opt2 with try12-opt2 constraints) was started on
camano at 12:33.

try19 (optimizing try15-opt2 with try9-opt2 constraints) was started on
vashon at 12:34.

try20 (optimizing try15-opt2 with try5-opt2 constraints) was started on
whidbey at 12:34.

Hopefully they produce something worthwhile!


Wed Jul 26 16:08:51 PDT 2006 Zack Sanborn

Okay, these models don't look like they produced anything worthwhile.  In fact,
by Undertaker's standards, they all do more poorly than try15-opt2, the
model they were trying to optimize.  That is not a good sign.

Chris and I are pretty lost when it comes to what we should do with this
protein.  We may try to make a TIM-barrel out of try9-opt2, but maybe not.

Wed Jul 26 16:33:31 PDT 2006 Chris Wong

try21 starts with the alignments, but uses the sheets and helices from
try15-opt2 with some modifications as suggested by Prof. Karplus on
July 18.
( make -k T0344.do21 > & do21.log ; gzip -9f do21.log ) & started on
orcas at 1635.


Fri Jul 28 10:30:25 PDT 2006 Zack Sanborn

Well, try21 was a total failure.  Part of the reason may be due to the
fact that the initial constraints (a la try1, from the alignments), were
kept in.  Also, the helix constraints had no weight associated with them
(not sure what happens when there's no weight for a constraint).

[Sat Jul 29 17:08:55 PDT 2006 Kevin Karplus
Default constraint weight is 1.0
]


try22, which fixes the above things, was started on orcas.  It likely
will not work either.

Fri Jul 28 13:56:29 PDT 2006 Zack Sanborn

Well, wouldn't you know it... try22 also looks really, really bad.

I think the best thing to do now is attempt to make some structures out
of this most current model.  I have some ideas to make a nice sheet that
may or may not work.

Well, I just noticed that there are some big breaks in this model.  I
could try a run where I try to patch up these breaks AND also use the
same constraints with even MORE weight to see if Undertaker can figure
something out.  I'll do that in addition to looking at some possibilities.

try23 started on orcas.

Now, the current best scoring model with an unconstrained cost function is
try8-opt2.  It has a nicely formed sheet, but then a bunch of junk.

I'm going to try to make a pair of strands I3-L13 and I20-V30 (strangely
hydrophobic, no?).  I'll move from there.

try24, with all strands from try8-opt2 plus the above strand, started on
orcas.  Also has all helices from try8-opt2 except for the one that
conflicts with one of the strands and another poorly predicted helix.

Also, George is giving us some manual top hits to start some runs off of.
He also believes that this protein could be multi-domained and that
breaking it into two (~100 aa's per), might be helpful.

George gave us the following manual top hits:

	1ois, 1griA, 1qr0A, 1dekA, 2kinA, 1b77A, 1gc1H, 1rhi3, 1aym3.

Chris is currently setting up a run with these manual top hits.

Fri Jul 28 16:30:04 PDT 2006 Chris Wong

try25 is a run that uses the manual top hits that George has
suggested.  He also suggested including some scaled down RR constraints.

( make -k T0344.do25 > & do25.log ; gzip -9f do25.log ) &
started on shaw at 1630.


Fri Jul 28 17:31:11 PDT 2006 Zack Sanborn

try24 had finished, where I was trying to form a new sheet.
Unfortunately, had I paid attention to the log file, I would have
noticed an error with the sheet constraint for the new sheet.

# SheetConstraint # Error: Hbond residue (T0344)K208 not in either strand
(T0344)I3--(T0344)L13 (T0344)V30--(T0344)I20

This sheet was not made by Undertaker.  try26  attempts to correct this
by fixing this error, starting from try24-opt2, which did appear to
move some backbone in the right direction.
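
The error above (Hbond residue K208 not lying in either strand) is the
kind of thing that can be caught before a run is started.  A minimal
sketch of such a check, assuming residues are written as one-letter code
plus number as in this README; the function names are hypothetical and
this is not how undertaker itself validates constraints.

```python
import re


def resnum(label):
    """Return the residue number from a label such as 'K208'."""
    m = re.fullmatch(r"[A-Z](\d+)", label)
    if m is None:
        raise ValueError(f"bad residue label: {label!r}")
    return int(m.group(1))


def hbond_in_strands(strand1, strand2, hbond_res):
    """True if hbond_res lies inside one of the two strands.

    Each strand is a (start, end) pair of labels; the end may have the
    lower number, as in the antiparallel strand V30--I20.
    """
    n = resnum(hbond_res)
    for start, end in (strand1, strand2):
        low, high = sorted((resnum(start), resnum(end)))
        if low <= n <= high:
            return True
    return False


# The try24 sheet constraint that undertaker rejected:
print(hbond_in_strands(("I3", "L13"), ("V30", "I20"), "K208"))
```

For the try24 constraint this prints False, matching the error in the
log; any Hbond residue numbered 3-13 or 20-30 would pass the check.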


Sat Jul 29 00:09:15 PDT 2006 Chris Wong

Just took a quick peek at the try's we did today... 23, 24, 25, 26 all
look bad.  I wonder what the subdomains that we made look like ?  I'll
take a look tomorrow.


Sun Jul 30 14:05:48 PDT 2006 Zack Sanborn

I'm hoping to put a bit of work into this target today... if only I knew
where to start!

It seems that the M1-V176 subdomain did not produce any targets; the
following errors were found in the try1.log.gz:

Error: Couldn't open file XXX0000.a2m or XXX0000.a2m.gz for input
# command:Error: unrecognized command START_COL skipped
# command:Error: unrecognized command XXX0000 skipped
Error: unrecognized command XXX0000.t2k.alpha.rdb skipped
Error: unrecognized command XXX0000.t04.alpha.rdb skipped
Error: unrecognized command XXX0000.t06.alpha.rdb skipped
Error: Couldn't open file XXX0000.dssp-ehl2.constraints or
XXX0000.dssp-ehl2.constraints.gz for input
Error: no target chain for CreatePredAlphaCost
Error: no target chain for CreatePredAlphaCost
Error: no target chain for CreatePredAlphaCost
Error: Couldn't open file Template.atoms or Template.atoms.gz for input
Error: Couldn't open file XXX0000.t04.undertaker-align.under or
XXX0000.t04.undertaker-align.under.gz for input

It looks like the make script was not able to replace the string XXX0000
with T0344 like it should.  The same errors were found in the other
subdomain L177-R229.  I'll have to investigate why.

In the meantime, I looked at my latest try26, where I attempted to get
that sheet between I3-L13 and I20-V30 to be formed.  It still didn't
form it,
but the wannabe-strands are sitting very close to one another.  I think
the problem is that the strands are oriented in such a way that they can't
form sheets without making some huge breaks.  It might be necessary to use
ProteinShop to reorient that section.

Just tried to use ProteinShop, but it's not putting in the correct
secondary structure (i.e. coils are helices, some helices are coils,
all strands turned into misshapen helices, etc.).  It would take an
hour to fix this secondary structure using ProteinShop, and it might
not even be worth the trouble knowing how "helpful" ProteinShop can be.

I'm going to look at the other models for some potential chimera pieces.

While I'm doing that, I'm going to try once more to build that sheet,
starting with try26-opt2, and upping the pseudocounts for some of the
Jiggle, Shift operators. I'm not hopeful that this will do anything.

try27 started on orcas.

Sun Jul 30 14:45:42 PDT 2006 Zack Sanborn

Well, I looked at some of the other try's.  To my (mostly untrained) eye,
it really appears that getting those subdomains working might help the
overall model.

So, I looked at the problem again, and I looked at the try1.under script
and noticed that "XXX0000" was still scattered throughout the document.
I copied try1.under and try1.costfcn to try2 and did a global replace of
XXX0000 with T0344.  I started try2 for L166-R229 and will see what
happens.
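
The global replace described here can be sketched in Python, together
with a check for any placeholder that survives.  This is an
illustration only: the helper names are hypothetical, and it covers just
the text substitution, not the extra files that the standard make is
expected to generate.

```python
PLACEHOLDER = "XXX0000"


def fill_placeholder(text, target, placeholder=PLACEHOLDER):
    """Replace every occurrence of the placeholder with the target ID."""
    return text.replace(placeholder, target)


def lines_with_placeholder(lines, placeholder=PLACEHOLDER):
    """Return 1-based numbers of lines that still contain the placeholder."""
    return [i for i, line in enumerate(lines, start=1) if placeholder in line]


# File names taken from the error messages in try1.log.gz:
template = [
    "XXX0000.a2m",
    "XXX0000.t04.undertaker-align.under",
    "Template.atoms",
]
filled = [fill_placeholder(line, "T0344") for line in template]
print(filled)
print(lines_with_placeholder(template))   # lines needing replacement
print(lines_with_placeholder(filled))     # should be empty afterwards
```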

Well, just as I suspected... try2 will fail because our initial make
didn't make all the required files (Template.atoms, initial alignment
based sheet/helix/rr constraints, etc.).

I just noticed what *might* be a problem.  Knowing that Makefile's are
a little sensitive when it comes to format, I noticed that the line:

	START_COL := 166

Had an extra space between "START_COL" and the ":=" and no other line
formatted this way has this space.  I removed the space and am rerunning
the original make with the following command:

	(make -k >& make2.log ; gzip -9f *.log *.atoms) &

The log is piped to make2.log to preserve the original make.log.
Unfortunately, I somewhat doubt this is the problem... seems way too
simple.

Sun Jul 30 15:23:21 PDT 2006 Zack Sanborn

For lack of any better ideas, I'm going to make a chimera.  The current
best scoring model (for both Undertaker and Rosetta) is try8-opt2.  It
has a nice beta-sheet and a large section of it (residues D92--R229)
follows the secondary structure prediction pretty well.  However, the
first 91 residues follow the secondary structure prediction pretty
terribly.

Looking through the latest models, I see that try25-opt2 does fairly
well at getting the secondary structure prediction correct for this
section that try8-opt2 fails at.  So, let's try to make a chimera of
try25-opt2/M1-L91 and try8-opt2/D92-R229.

I haven't tried the whole superposition technique.  For another protein
where I was attaching subdomains, it worked pretty well if I just
copy-and-pasted the "domains" together without superposition as long
as there weren't enormous clashes.
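
The copy-and-paste approach amounts to keeping the ATOM records for one
residue range from each model and concatenating them.  A minimal Python
sketch under stated assumptions: both models are standard fixed-column
PDB files with the same residue numbering, the function names are
hypothetical, and no superposition, atom renumbering, or clash check is
done.

```python
def residue_number(line):
    """Residue sequence number from a PDB ATOM record (columns 23-26)."""
    return int(line[22:26])


def select_residues(pdb_lines, first, last):
    """Keep ATOM records whose residue number lies in [first, last]."""
    return [line for line in pdb_lines
            if line.startswith("ATOM")
            and first <= residue_number(line) <= last]


def make_chimera(lines_a, range_a, lines_b, range_b):
    """Concatenate one residue range from each model, then add END.

    Coordinates are copied unchanged, so the two pieces stay wherever
    they sat in their original models.
    """
    chimera = select_residues(lines_a, *range_a)
    chimera += select_residues(lines_b, *range_b)
    chimera.append("END")
    return chimera
```

For the chimera described here the call would be
make_chimera(try25_lines, (1, 91), try8_lines, (92, 229)), where
try25_lines and try8_lines are the lines read from the two
(uncompressed) model files.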

After making the chimera (T0344.chimera.pdb.gz), I checked it out with
Rasmol and the two "domains" are pretty well separated.  Now, let's
see if Undertaker can put them together.

try29 optimizing T0344.chimera.pdb.gz, was started on lopez.

Oh, it appears George is attempting some work on this target as well.
He took the try28 spot.  I'll get out of the README for him to update it.

Sun Jul 30 15:38:21 PDT 2006 Zack Sanborn

Well, the subdomain "fix" I implemented didn't do a damn thing it
appears.  I got a bunch of different errors but most of them seem
related to not being able to replace XXX0000 with T0344 in some cases.
It's curious because it seems that most times it works.  I'm sure
Kevin has come across this problem, I'll ask him.


Sun Jul 30 15:57:13 PDT 2006 Kevin Karplus

Zack and Chris did not use the standard Makefile to create the
subdomains.  I moved M1-V176 to junk-M1-V176, and modified ./Makefile
to set up M1-V176 correctly.  I then started the M1-V176 make on vashon.

I leave it to Chris or Zack to do the other subdomains.

Sun Jul 30 16:00:23 PDT 2006 George Shackelford

Last comments lost when peep cut me off...
I increased the size of the database used for distant fold-recognition
by a factor of four. I have added two templates to the MANUAL_TOP_HITS,
1pqzA and 1ireB, and I am going to use them in try28.

Try28 running on peep.


Sun Jul 30 16:15:10 PDT 2006 Zack Sanborn

I've copied what Kevin did for the first domain to remake the subdomain
for L166-R229.  I started the make for L166-R229 on orcas.

Sun Jul 30 16:23:55 PDT 2006 Chris Wong

I think I forgot to add to this README that, on Friday, I was
attempting to get the subdomains going based on notes from CASP6.  I
selected the spot for the subdomain boundary based on what *I* thought
looked like a decent position based on the try8-opt2 model... one of
the least-worst models we have so far.  I included 10 residues of
overlap between the two subdomains.

Sun Jul 30 17:55:00 PDT 2006 Zack Sanborn

The optimized chimera try29 run has finished and, with an unconstrained
cost function it scores 4th best.  Weirdly, the chimera it was optimizing
has the top score, despite having a massive break!

try29-opt2 seems to lose points in how badly it is packed.  Looking at
the structure confirms these costs: there are many holes in this
structure.  However, I'm surprised it scores as well as it does.  I will
start another optimization of this structure, increasing the dry
weights and the phobic_fit values to see if I can get this structure
to pack a little better.

It might also help to take this structure's sheet and helix constraints
and restart from the alignments.  So far this structure does pretty
well at matching the ehl2 prediction (aside from a section that is
weakly predicted to be strand/coil and is helix/coil in the try29-opt2).
It might be that optimizing the current model won't get us the best
possible structure.

try30, optimizing try29-opt2, started on whidbey.


Sun Jul 30 18:12:59 PDT 2006 Zack Sanborn

I've set up a try31, which takes the sheet and helix constraints from
try29-opt2, and rebuilds a model from the alignments.  There is one
difference, I removed the helix constraint from V4-E18, since that
region was predicted to be strand/coil.  We'll see if Undertaker finds
a better way of making that region.

try31 started on shaw.


Sun Jul 30 18:15:38 PDT 2006 Zack Sanborn

try27 has finished and, like I imagined, it wasn't successful at building
that sheet I wanted.  Also, it scores terribly, with try27-opt1 scoring
badly and try27-opt2 scoring even worse.  Ah well.

Sun Jul 30 20:58:25 PDT 2006 George Shackelford

Try28 finished disastrously. I forgot to change the try15's to try28's
in the try28.under file; I've wiped out try15's results!!

Furthermore I did not set the costfcn the way I needed it. It had
constraints I never wanted and missed some I could use. And I left
ReadFragmentAlignment in!!

That is all fixed for try32. It should run faster and the results should
be better.

try32 running on peep.


Sun Jul 30 23:08:45 PDT 2006 Kevin Karplus

Actually, try28 did not wipe out try15, despite George's error.
The try28 versions were not gzipped (because they didn't have "try28"
in their names), so I renamed them correctly, gzipped them, and reran
"make T0344.do28" so as to do the repacking and gromacs runs (making
sure to delete the empty gromacs output first).

try28 may still be a disaster, as it scores terribly with the try28
costfcn.  This is probably because it was optimizing for the try15
costfcn.


Mon Jul 31 10:37:42 PDT 2006 Zack Sanborn

Well, try30 is now the best scoring model with an unconstrained cost
function.  Of course, that doesn't mean it's a good model, in fact it
is still packed terribly.

try31, where we attempted to make a new model from alignments using
the constraints from try29-opt2, was not very successful.  It keeps much
of the secondary structure, but is not very well put together.  Not
sure if it's worth trying to fix it.  We need some good models by
this afternoon.

George's try32 is interesting.  It has two separated subdomains of
mostly beta sheets connected by a long helix.  It would pack MUCH better
if we could get that helix to pack against the sheets.  And, I think
that's even possible, if ProteinShop didn't totally screw up the
secondary structure of inputted models.

Actually, I just tried opening up try32-opt2 in ProteinShop and, while
it mostly completely gets the secondary structure wrong, the helix that
I want to pack better is correctly identified.  So, I should be able
to ProteinShop a better packing of try32-opt2.

Mon Jul 31 11:26:00 PDT 2006 Zack Sanborn

I just re-packed try32-opt2 using ProteinShop.  I did the best I could
given time and ProteinShop-related constraints.  I've begun an
optimization of the ProteinShop'd model (T0344.model1_renum.pdb) in
try33.

try33 started on vashon.

I now want to take a look at our subdomain predictions.


Mon Jul 31 12:04:29 PDT 2006 Zack Sanborn

I looked at the subdomain predictions and have put them together using
a combination of "cat" and ProteinShop.  Given my limited time, I didn't
bother trying to pack things nicely with ProteinShop.  I simply separated
the chains far enough that I hope Undertaker will figure out the
best way to pack things.  The ProteinShop model can be found in
decoys/T0344.model2.pdb.gz

try34, optimizing T0344.model2.pdb.gz, upping breaks, clashes, etc.,
started on orcas.

Also, George spoke to me about his try32-opt2.  Apparently, the template
it was built from was a "clam"-like protein where there were two domains
separated by a long helix that acts as an extended hinge.  So, it might
be worth just optimizing try32-opt2 also, to clean up some of the breaks,
clashes, etc.


Mon Jul 31 15:17:54 PDT 2006 Zack Sanborn

try34 has finished, which attempted to optimize a ProteinShop'd chimera
of the subdomains M1-A165 and L166-R229.  It scores 4th best with an
unconstrained cost function, which is surprising considering it started
out life as a ProteinShop model.  I will try another optimization of it
to close up some of the breaks.

Hmmmm, I just noticed a try35 that looks like George has started. It
appears that it is starting from a particular alignment 'long', but it's
hard to say.  I'm sure George will update the README when he gets a
chance.

try36, optimizing try34-opt2 reducing breaks, was started on orcas.

Mon Jul 31 15:43 PDT 2006 George Shackelford

try32 actually looks decent. It has some bad breaks and a couple of
helices didn't form but it does match well with the ehl2 constraints
and it has really good hbond scores. Fixing the breaks and clashes is
probably a matter of doing a polishing run from gromacs-repack. I just
don't have time to do it.

Can't seem to get word-wrap to work...

Mon Jul 31 16:15:27 PDT 2006 Kevin Karplus

esc-x auto-fill-mode toggles the automatic wrap.
You can make it turn on by default by including
	(setq text-mode-hook '(lambda() (turn-on-auto-fill)))
in your .emacs file in your home directory.

We need to submit this target tonight, and I see lots of comments, but
no coherent consensus about what models to submit.


Mon Jul 31 16:27:30 PDT 2006 Zack Sanborn

Sorry Kevin, we were getting to that part.  Two more runs are still
going (try33 and try36).  try33 is the optimization of a ProteinShop'd
version of try32-opt2 (George's model).  try36 is the optimization of
try34-opt2, and has a good chance at improving that model (which is
currently fourth best).

Currently, I'd suggest submitting the following models:

	T0344.try30-opt2.pdb (Top scoring model)
	T0344.try8-opt2.pdb  (2nd highest, robetta4)
	T0344.try34-opt2.pdb (4th highest - from subdomains)
	T0344.try29-opt2.gromacs0.repack-nonPC.pdb
			     (Top Rosetta scoring)
	T0344.try32-opt2.pdb (George's model, for variety!)

But, this list might change based on what happens with the runs still
going.

Mon Jul 31 16:40:47 PDT 2006 Zack Sanborn

try33 has finished and scores VERY poorly with the unconstrained cost
function.  The model actually looks fairly reasonable, just needs a
LOT of work before it can be considered a good model.  So, I don't think
it should be put into this list.  If we had more time to work the
kinks out of this structure, it might be worth it.  Unfortunately, it has
to be done in an hour or so.

Still waiting for try36 to finish.  Unfortunately, it hasn't even gotten
halfway (no try36-opt1, yet).


Mon Jul 31 17:13:58 PDT 2006 Zack Sanborn

Well, try36 hit the halfway point.  I think this means it's a couple
hours away from getting try36-opt2.  Good news is that it looks like
try36-opt2 will be a pretty good model.  try36-opt1 currently scores
third best, edging out try30-opt1.pdb by a fraction of a point.  This
makes it likely that try36-opt2 could be the highest scoring model,
beating try30-opt2 by a little bit.

I will use my powers of ESP to predict the following submissions:

	T0344.try36-opt2.pdb (Top scoring model - from subdomains)
	T0344.try30-opt2.pdb (2nd highest, from try29-opt2: chimera of
				try25-opt2/M1-L91 and try8-opt2/D92-R229).
	T0344.try8-opt2.pdb  (3rd highest, robetta4)
	T0344.try29-opt2.gromacs0.repack-nonPC.pdb
			     (Top Rosetta scoring, optimized chimera of
				try25-opt2/M1-L91 and try8-opt2/D92-R229)
	T0344.try32-opt2.pdb (George's model, for variety!)

I will let Kevin know of these suggestions when the time is right (i.e.
when try36 is done).  I've updated the superimpose-best.under with
these suggestions, but haven't quite run it yet since try36-opt2 does not
exist yet.

Mon Jul 31 18:21:42 PDT 2006 Zack Sanborn

Updated the T0344.method file with the above suggestions.


Mon Jul 31 19:20:38 PDT 2006 Kevin Karplus

decoys/score-all.try36.pretty lists the top models as
	try30-opt2, try29-opt2, try8-opt2, try10-opt2, try36-opt2

Zack, do you want to revise your list?

Mon Jul 31 21:18:57 PDT 2006 Kevin Karplus

Zack is right that try36-opt2 tops the unconstrained list, though, so
maybe he did intend this list.

Mon Jul 31 21:47:18 PDT 2006 Kevin Karplus

Submitted with comment

    T0344 is essentially an ORFan, so all our preliminary methods (local
    structure prediction, HMM-based fold-recognition, and residue-residue
    contact prediction) are of limited use, because there is no
    evolutionary information in the multiple alignment to exploit.

    As a result, we were reduced to rather blind guessing  (politely
    called ab initio methods).

    Model 1 is try36-opt2, our top scoring Undertaker model, built from a
	    pair of subdomain predictions (on M1-V176 and L166-R229), then
	    optimized with two rounds of undertaker optimization.
	    try36-opt2 < try34-opt2 < model2. The model2 file was created
	    using ProteinShop to merge subdomain predictions, but it is
	    not clear from the README file exactly which subdomain
	    predictions went into creating model2.

    Model 2 is try30-opt2, 2nd highest scoring model optimized through two
	     rounds of undertaker optimization from a chimera made by
	     putting the residues M1-L91 from try25-opt2 and D92-R229 from
	     try8-opt2 together.

	     try30-opt2 < try29-opt2 < chimera.
	     try25-opt2 < alignments
	     try8-opt2 < robetta4

    Model 3 is try8-opt2, 3rd highest scoring model, made by optimizing
	    the ROBETTA_TS4 server model with undertaker.

    Model 4 is try29-opt2.gromacs0.repack-nonPC.pdb, the highest scoring
	    model (according to rosetta) of those we had rosetta repack
	    sidechains for.
	    try29-opt2 < chimera.
	    After undertaker optimized try29, the model was reoptimized
	    with gromacs and sidechains (except PRO and CYS) were repacked
	    by rosetta.


    Model 5 is try32-opt2, a model made from an alignment to template
	    1pqzA, which George Shackelford chose, even though it does not
	    come up as a hit with our standard fold-recognition method.
	    George left no explanation in the README file for why he chose
	    1pqzA.  We are including it, even though it scores very badly
	    and still needs a lot of work to close gaps, because it
	    represents a somewhat plausible model that is quite different
	    from our other guesses.

------------------------------------------------------------

Tue Aug  1 04:57:31 PDT 2006 George Shackelford

Unfortunately my comments concerning try32 were lost when I lost the login
to bark. For some reason, my ssh connections to machines at school get
disconnected rather arbitrarily. When I am using emacs to edit the README
that means that what I have entered is simply lost. Currently I am using
kwrite via 'fish' which keeps a copy in memory at home and connects and
uploads the document when I save.

The template 1pqzA was the top scoring match using the "alphabetmatch"
program. The output is as follows:

# program: alphabetmatch
# George Shackelford
#
# Target: T0344
# length: 229
# length range: 212 to 251
# alphabets used:
#   ehl2 burial
#
id      score   per residue
5S      10N     10N
1pqzA   516.258 2.2544   ,3.30.500.10-137,2.60.40.10-101
1ois    503.966 2.20072  ,1.10.10.41-92,2.170.11.10-128
2aefA   501.921 2.1918
1cov3   489.522 2.13765  2.60.120.20-238
1ireB   489.088 2.13576  ,1.10.472.20-126,2.30.30.50-99
1lnuB   488.809 2.13454
2gk4A   483.64  2.11197
2amyA   483.213 2.1101
2gqrA   481.628 2.10318
1ugpB   480.387 2.09776  ,1.10.472.20-126,2.30.30.50-97

Notice that the score for 1pqzA is not only the highest but appears to be
on a plateau above the scores starting with 1cov3 and score 2.13765.
Scores decline much more slowly after that. Because 1pqzA's score was
(somewhat) in a class by itself, I decided to focus on using it as a
template.
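
The plateau can be checked directly from the table by looking at the
drop in per-residue score between neighbouring hits.  A short Python
sketch using only the IDs and scores listed above; the per-residue
value is taken to be the score divided by the target length of 229,
which reproduces the table's third column, and the function names are
hypothetical (this is not part of alphabetmatch).

```python
TARGET_LENGTH = 229

# (template id, score) pairs copied from the alphabetmatch output above.
hits = [
    ("1pqzA", 516.258), ("1ois", 503.966), ("2aefA", 501.921),
    ("1cov3", 489.522), ("1ireB", 489.088), ("1lnuB", 488.809),
    ("2gk4A", 483.64), ("2amyA", 483.213), ("2gqrA", 481.628),
    ("1ugpB", 480.387),
]


def per_residue(score, length=TARGET_LENGTH):
    """Score normalised by target length."""
    return score / length


def drops(ranked_hits):
    """Per-residue score drop between each hit and the next one down."""
    return [(id_a, id_b, per_residue(s_a) - per_residue(s_b))
            for (id_a, s_a), (id_b, s_b) in zip(ranked_hits, ranked_hits[1:])]


for upper, lower, drop in drops(hits):
    print(f"{upper:6s} -> {lower:6s} {drop:.5f}")

largest = max(drops(hits), key=lambda d: d[2])
print("largest drop:", largest[0], "->", largest[1])
```

On these numbers the largest single drop is between 2aefA and 1cov3,
with the drop from 1pqzA to 1ois a close second, so the top three hits
sit above the rest of the list.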

The result does match the predicted ehl2 constraints.