Mon May 22 09:36:54 PDT 2006
T0296
Make started Mon May 22 09:38:28 PDT 2006
Running on camano.cse.ucsc.edu

Mon May 22 09:44:30 PDT 2006 Kevin Karplus

BLAST finds no close homologs in PDB.

Mon May 22 10:31:45 PDT 2006 Kevin Karplus

The more sensitive t06 search also finds no PDB files in the multiple
alignment, though there is enough diversity in sequences that are
found to have some conservation signals.  The t2k and t04 multiple
alignments seem to be finding the same number of sequences as the t06
multiple alignment.

Mon May 22 14:20:01 PDT 2006 Kevin Karplus

The top-scoring templates are mostly TIM-barrels, but the best E-value
is only 0.166, so this is not a strong endorsement of the fold.  The
third highest score is for 1j0dA, which is a different fold entirely.

The protein is long enough (445 residues) to have multiple domains, so
we might want to divide the protein up, though I don't know yet where
I'd put domain boundaries.

Mon May 22 18:41:37 PDT 2006 Kevin Karplus

It looks like we have good matches up to about L260, but the conserved
C-terminal region seems to be the part that we can't predict structure
for!

Mon May 22 21:24:43 PDT 2006 Kevin Karplus

The try1-opt2 run made a mess of even the "easy" part where most of
the alignments pretty much agreed on the sheets.

We may have to try breaking this into two domains, perhaps around A272 ???
I'll start a subdomain run for M1-A272.

Thu May 25 15:38:31 PDT 2006 Kevin Karplus

I looked at M1-A272 try1-opt2 and see that the first part looks ok,
but the last two strands have not gotten lined up.  These strands are
(approx) V207-V211 and G265-D268 and are probably parallel to the
strand V179-V184.

Here is a potential alignment:
178   gvaKLVVFan
201 gvgeadVIINVgv
261   gvefGIVDLSla

The Hbond would be coming off A180.

Mon Jun 5 17:11:32 PDT 2006 George Shackelford

I reran the rr.constraints and have written a program
"factor_constraints" to factor the weights down to .1 of the original
value. I am going to paste in the ehl2 constraints and the align
sheets and check for the suggested alignment above.
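
A minimal sketch of what a weight-factoring step like
"factor_constraints" might do. The real program and the rr.constraints
layout are not shown in this log, so the sketch assumes (hypothetically)
that each constraint line ends in a numeric weight, as the
SheetConstraint lines further down in this file do. The name
factor_weights is made up for illustration.

```python
def factor_weights(lines, factor=0.1):
    """Scale the trailing numeric weight of each constraint line by `factor`.

    Assumption: the weight is the last whitespace-separated field.
    Comments, blank lines, and lines without a numeric last field are
    passed through unchanged.
    """
    out = []
    for line in lines:
        stripped = line.rstrip("\n")
        fields = stripped.split()
        if not fields or fields[0].startswith("#"):
            out.append(stripped)
            continue
        try:
            weight = float(fields[-1])
        except ValueError:
            out.append(stripped)
            continue
        fields[-1] = f"{weight * factor:g}"
        # Note: fields are rejoined with single spaces, so tabs are not preserved.
        out.append(" ".join(fields))
    return out


example = [
    "# hypothetical input in the SheetConstraint layout",
    "SheetConstraint (T0296)T22 (T0296)G26 (T0296)A394 (T0296)I398 hbond (T0296)T22 50",
]
scaled = factor_weights(example)
print("\n".join(scaled))
```

Keeping the factor as a parameter makes it easy to produce both a
0.1-scaled file and the original weights from the same input.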

Mon Jun 5 19:46:39 PDT 2006 George Shackelford

I've been studying the under-align.sheets, and looking at the
conserved_t06, and it looks as though the barrel that it's trying to
form is inside-out. Burial makes it look really out-of-kilter. 70-74
is clearly wrong.

So I've taken out all sheets involving 70-75. I'm putting in rr
constraints factored by 0.1. I'm going to do another try and see what
we might get.

M1-A272 try2 running on peep

I have set up a new domain D268-N445 and have started it up on orcas.
Hope it works ok.

Tue Jun 6 10:04:33 PDT 2006 George Shackelford

I have looked at the try1 for D268-N445 and it looks ugly. There are
strands that should not be buried forming a parallel sheet, and part
of a helix that should be buried and is likely a strand, etc. I'm going
to factor down the rr constraints and drop the alignment sheets to put
more emphasis on the predicted ehl2 for try2. Let's hope it gets
better results.

try2 for D268-N445 running on peep

Try2 for M1-A272 is still trying to form a barrel. There are 6 strands
in place; we need two more. Try2 does score slightly better
than try1 but there are still some pretty bad breaks. The alignment of
the barrel strands is going to need some manual adjustments as
well. The strand starting with 70 should start at 74 (despite what the
align sheet constraints suggest). One missing strand appears to be
20-27. Another is 194-201. The strand starting at 263 should start at
264 or 265.

The conserved_t2k residues look to cap one end; the other end may need
capping as well.

I'm uncommenting sheet constraints involving 76-78 and 75-76. I'm also
going to use the unfactored rr.constraints. Meanwhile I can be checking
the barrel.

try3 not yet started! Need a free computer!

try2 has stopped due to missing template atoms file?? Will try to
restart with uncommented build of template atoms!

try2 restarted ~ 12:08

Tue Jun 6 14:00:07 PDT 2006 George Shackelford

try3 for M1-A272 started on peep.

The try2 for D268-N445 finished. It scores better than try1, but has
bad breaks, and the matches to ehl2 and burial are poor. The rr
constraints don't seem to match as well. There must be some alignment
it likes a lot.

OK, we're going to punch up the constraints to 25, return to
rr.constraints (before the factoring) and increase near_backbone to
15. Perhaps I should increase breaks as well, but let's see how this
does.

try3 for D268-N445 started on whidbey (!)

Tue Jun  6 20:23:48 AKDT 2006 George Shackelford

Both try3's are improvements. The try3 for M1-A272 still needs the strands
of the barrel lined up (ASSUMING THERE IS A BARREL!). I am working with
the current generated sheet constraints to finish the barrel.

The other try3 is more problematical. It is quite different from the
original decoys but the phobic fit is not good. This might not be a problem
depending on the fit with the barrel. At least combined they are a big
improvement over the original T0296 try1.

SheetConstraint (T0296)I111 (T0296)G113	(T0296)K73 (T0296)V75	hbond (T0296)G112	1
SheetConstraint (T0296)F110 (T0296)F114	(T0296)C144 (T0296)N148	hbond (T0296)I111	1
SheetConstraint (T0296)C144 (T0296)N148	(T0296)V179 (T0296)V183	hbond (T0296)S145	1
SheetConstraint (T0296)K181 (T0296)L182	(T0296)D206 (T0296)V207	hbond (T0296)L182	1
SheetConstraint (T0296)L182 (T0296)F185	(T0296)I208 (T0296)V211	hbond (T0296)V184	1 ok
SheetConstraint (T0296)N210 (T0296)G212	(T0296)V267 (T0296)L269	hbond (T0296)N210	1 not

211 in
267 out
SheetConstraint (T0296)N210 (T0296)G212	(T0296)D268 (T0296)S270	hbond (T0296)N210	1 ok

211 in
187 in
210 to 268

? 48
73
111
148
183
211
267

Hmmm. There doesn't seem to be any way we'll get enough strands to make a
proper barrel. I must re-think this. I have found a nice web page with a
variety of (mainly) parallel sheets. I think I need to try to fit one of
them and double check rr constraints. I think the templates with barrels
are misleading us. It is all too easy to find a barrel that will match six
strands where the six strands don't form an actual barrel. I should take a
look at the 1j0dA Kevin noticed.

Of interest is the strong long helix I37-E65. This suggests something
along the line of:
2|3|4|1-5|6
where | means parallel and - means anti-parallel. However 6 seems a long
way from 5...
see:
http://kinemage.biochem.duke.edu/~jsr/html/anatax.3c.html
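
The strand-order notation above can be unpacked mechanically. This is
only an illustrative sketch: parse_topology is a made-up helper, and it
assumes nothing beyond the convention stated in the text (| for
parallel neighbours, - for anti-parallel neighbours, strands numbered
in sequence order).

```python
import re


def parse_topology(topology):
    """Split a string such as '2|3|4|1-5|6' into strand order and pairings."""
    # The capturing group keeps the separators in the result list.
    parts = re.split(r"([|-])", topology)
    strands = [int(p) for p in parts[0::2]]
    separators = parts[1::2]
    kinds = {"|": "parallel", "-": "antiparallel"}
    pairings = [
        (a, b, kinds[sep])
        for a, b, sep in zip(strands, strands[1:], separators)
    ]
    return strands, pairings


order, pairs = parse_topology("2|3|4|1-5|6")
for a, b, kind in pairs:
    print(f"strand {a} next to strand {b}: {kind}")
```

Listing the pairings this way makes it easier to compare a proposed
sheet order against the strand neighbours seen in a decoy.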

After looking at 1j0dA, I am finding it quite appealing. It matches up
well (except for the "seventh" strand). I'm going to try and use it as a
template (above others) for another run.

I'm going to score what I have plus all the servers. I like how
forcast-s_AL1 and FORTE2_AL1 look.

Ouch. There are some really bad predictions out there. I found a couple
more that I'd like to include. Hope this doesn't screw up undertaker.

Try4 for domain 2 running on peep.

Wed Jun  7 23:30:58 PDT 2006 George Shackelford

After a burp (needed to add the 1j0dA as a MANUAL_HIT), try4 finished. It
is definitely different from try3; it doesn't try to build a barrel. On
the other hand, it doesn't score better. I'm going to do a night run with
the rr.constraints kicked back up and include both tries in the mix. At
least the server includes didn't make things crash.

I'm cranking phobic fit up to 4 as well.

try5 for domain 1 running on orcas.

For try4 in D268-N445 I'm going to drop the constraint weight to 15, and
up the phobic fit to 4 yet dropping near_backbone to 5. Just trying to get
something going.

try4 running on orcas.

Thu Jun  8 11:55:40 PDT 2006

The try5 for domain 1 scores well, but it is still trying to bend the sheet around.
I looked at try4 for domain 2 and it actually scores worse.

I am concerned that the formation of the parallel sheet actually spans the
two domains. The real solution requires applying what we have learned
(esp. about 1j0dA)  to a new try at the original T0296 level. I am taking
what I have for try5.under and .costfcn for the first domain and using
them as a start for try2. I'm changing phobic fit back to 2 and
constraints weight to 15. I have to take out the includes of two tries
because they are for the first domain. Hope this works.

try2 running on peep

Fri Jun  9 21:10:56 PDT 2006 George Shackelford

At the meeting today, I discussed the structure of 1j0dA and how it wraps
around to form a sheet using strands from the start, middle, and end
sections of the sequence. Because of this 1j0dA fails to qualify as two
domains. I believe that is the case for this sequence as well. Now I'll
need to do all work as one domain.

One good point that came up is that we can get some other sequences from
SCOP c.79.1.1. Of course most of the sequences consist of chains
A,B,C,D,E... I can still get some that are different.
I'm liking the rr.constraints but I don't want to overwhelm with them.
Still I need to get another run going.

I can see where we need to take T22-T24 and put it next to M406-E408 or
between it and G265-D268. That tends to match what I see in 1j0dA.

I need to include other SCOP proteins that match 1j0dA such as:
1p5jA	372
1bksB	397
1qopB	396
1kl7A	514
1a50B	396
2tysB	397
1k8yB	396
2wsyB	396
1e5xA	486
I chose those which are greater than 370 residues in length. Hopefully
this will help in aligning with the target. A number of these have 2tysB
as their FSSP representative.

I THINK I can add these as MANUAL_TOP_HITS and run make again to get
new alignments, then add them manually to the *.under files.

I have completed the 'extra_alignments','read_alignments' and
'all-align.a2m.gz'. Now to add to try3.under and do another run.

try3 running on peep.

Well, that was a disaster. Try3 didn't do as well as try2. I really
need to get that 24-406-265 sheet built correctly. Until I do that, I
don't think things are going to get better.

First I am going to take out the server pdb's I've included. I think
they can only gum up the works. I'm loosening the rr.constraints back
to rr.0.1.constraints.

Let's take a look at near-backbone to help align the strands. I need
to manually build all of that sheet, not only the first three strands
but the remaining three as well.

try4 running on peep

Sun Jun 11 00:09:34 PDT 2006 George Shackelford

NO, NO, NO. I took a look at try4-opt1 and I stopped the run. All that
came out were helices. It broke badly. I am going to try something
else real quick before bed. I'm going to crank up rr.constraints
(again!) and I'm going to reactivate a couple of the server
pdb's. This is only a guess.

I'm also increasing the weight of constraints to 20.

try5

Sun Jun 11 12:20:11 PDT 2006 Kevin Karplus

try5-opt2 does not score quite as well as try2-opt2 with the
try5.costfcn, though the constraints are slightly better.
Part of the problem may be that the individual rr constraints have
much more weight than corresponding constraints from the helices and
strands.

It would probably be a good idea to increase the weights for the
sheets---somewhat for the sheets that have already formed (say weight
5) and considerably for desired new sheets (say weight 20).
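
One way the suggested reweighting could be scripted is sketched below.
It assumes the eight-field SheetConstraint layout seen elsewhere in
this file (two strand ranges, the word hbond, a residue, and a weight);
the class and function names are hypothetical, and the weights 5 and 20
are just the values suggested above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SheetConstraint:
    start1: str
    end1: str
    start2: str
    end2: str
    hbond: str
    weight: float

    @property
    def strands(self):
        """The two strand ranges, used as a key to identify the pairing."""
        return (self.start1, self.end1, self.start2, self.end2)


def parse_sheet_constraint(line):
    fields = line.split()
    if len(fields) != 8 or fields[0] != "SheetConstraint" or fields[5] != "hbond":
        raise ValueError(f"unexpected SheetConstraint layout: {line!r}")
    return SheetConstraint(*fields[1:5], hbond=fields[6], weight=float(fields[7]))


def reweight(constraints, already_formed, formed_weight=5.0, new_weight=20.0):
    """Give formed sheets a modest weight and desired new sheets a large one."""
    return [
        replace(c, weight=formed_weight if c.strands in already_formed else new_weight)
        for c in constraints
    ]


lines = [
    "SheetConstraint (T0296)I48 (T0296)T49 (T0296)N72 (T0296)K73 hbond (T0296)T49 1",
    "SheetConstraint (T0296)T22 (T0296)G26 (T0296)A394 (T0296)I398 hbond (T0296)T22 1",
]
parsed = [parse_sheet_constraint(line) for line in lines]
formed = {parsed[0].strands}  # pretend only the first pairing already exists
reweighted = reweight(parsed, formed)
for c in reweighted:
    print(c.strands, c.weight)
```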

Sheet constraints for already formed sheets can be picked up from
decoys/try*-opt2.sheets and from T0296.undertaker-align.sheets

I'll leave George to figure out which sheets he wants to try to
encourage.  This evening I'll do a preliminary submission of 5 models
that are different from each other (tentatively, try5, try2, try1, and
2 alignments, but I'll change that based on what George does today).

Sun Jun 11 13:00:17 PDT 2006 George Shackelford

Back on-line after a few problems here at home with the computer...  I
have been looking at what happened to try4 and try5, and I know that I
do need some rr.constraints (at a rather high level) just to keep from
falling into a bunch of helices.

[I DISAGREE--sheet constraints are much more important than rr
constraints for preventing helices.---KJK]

Another observation is how much better the t2k set of sequences is
than t04 and especially t06. t2k contains a number of the SCOP
sequences (c.79.1.1) where I believe T0296 belongs and t06 contains
one or two. Is this due to "contamination" by TIM barrels of t06 or
is this due to t2k being more forgiving and accepting of different
sequences?

It would be a good check if we ran a fold recognition test on t2k,
t04, and t06 where we limited our sequences to those where the
e-values were 1e-2 at best. This emulates the situation where we have
weak fold recognition / new fold. I think that t2k is better in this
area; when looking at weak FR / NF, we often look at t2k predictions
because they seem better.

[AGAIN---I DISAGREE.  T2k is better when you have easy fold
recognition, but for hard fold recognition, t06 seems to be better.  No
harm in trying though, as we're pretty desperate for anything on this
target.  ---KJK]

With this possibility in mind, I have embarked on doing a t2k based
rr.constraint prediction and including t2k undertaker alignments. I am
also going to include t2k.alignment constraints as I can.

The new rr.t2k.constraints are showing predictions in the so-called
"domain 2." I find them plausible. Furthermore the probabilities are
lower; that would make sense because our information really is weaker.

try6 running on peep.

Try6 stinks. I'm going to re-include rr.0.1.constraints, remove
the bonus flag from the top rr.t2k.constraints, re-include some of the
server models (they must be providing some kind of alignments that
help us), and up the weight on constraints to 20 (again). I'm running
out of time here for the soft deadline. I don't think I'm going to
really get any better. Eventually, I'll have to take the best of the
domain 1 sheets and lock them in place while the domain 2 part is
sorted out. Going to be difficult.

try7 running on lopez.

Sun Jun 11 20:28:38 PDT 2006 Kevin Karplus

I'm going to send in a preliminary prediction now, even though try7
isn't done.  If it looks better than what we have, we can replace it
in the morning.  There is (probably) still time to work on this after
the soft deadline.

Sun Jun 11 23:05:00 PDT 2006 George Shackelford

Having looked at try7, I don't think we'll be replacing anything with
it. It still doesn't build that bridge we need. Sigh. I'll have to pick
up on this Tuesday.

Fri Jun 16 12:09:38 PDT 2006

From what I have learned in working on 299, I am doing try8 by
commenting out the ReadFragments early on, putting the TryAllAlign
back in, and using the t04 rr.0.1.constraints. Let's see what we can
get now.

try8 running on peep

Wed Jun 21 15:58:20 PDT 2006 George Shackelford

Try8 at least gets the 27-31 strand across to where I want it but no
sheets form. Even the good early ones are lost. I'm going to do
another try that might yet get what I need. I'm going to actually
include the working sheets and the proposed sheet in the constraints!
I can't believe I left them out for try8 but I did. In this one, I'll
make a concerted effort to complete the 36x-39x-2x sheet. I've noticed
from the t06.str2 predictions that there is an interesting similarity
between 39x and 2x stretches. I think I'll try to match T22 to
T394. Will that bond work?

I'm not quite convinced about the 39x to 36x sheet. I think 39x to 32x
might be the correct one. Currently we're missing a 36x to 32x
connection.

I'm really not convinced about the 415-265 sheet. I'm commenting it
out. I think that the alignment is trying to force it.

# the sheets from try7 which has the sheets < 230 correct.
# I can keep those and work on the
SheetConstraint (T0296)I48 (T0296)T49	(T0296)N72 (T0296)K73	hbond (T0296)T49	5
SheetConstraint (T0296)V71 (T0296)S76	(T0296)G112 (T0296)L117	hbond (T0296)K73	5
SheetConstraint (T0296)G113 (T0296)S115	(T0296)S146 (T0296)N148	hbond (T0296)F114	5
SheetConstraint (T0296)C144 (T0296)G150	(T0296)V179 (T0296)F185	hbond (T0296)S145	5
SheetConstraint (T0296)I209 (T0296)N210	(T0296)V184 (T0296)F185	hbond (T0296)N210	5
# These MAY be forming the other sheet (in the "2nd" domain).
# This next one is dubious - we're not using it
# SheetConstraint (T0296)G265 (T0296)L269	(T0296)T415 (T0296)M419	hbond (T0296)D268	10
SheetConstraint (T0296)S344 (T0296)L345	(T0296)A326 (T0296)F327	hbond (T0296)L345	10
SheetConstraint (T0296)I364 (T0296)I366	(T0296)A394 (T0296)R396	hbond (T0296)A365	10
# the missing constraint??
SheetConstraint (T0296)T22 (T0296)G26	(T0296)A394 (T0296)I398	hbond (T0296)T22	50
# I still need to connect 36x to 34x (I think)
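
A small sanity check that could be run on hand-written constraints
before starting a try. It assumes residue tokens look like (T0296)T22
and that the two strand ranges in a SheetConstraint should span the
same number of residues. That equal-length property holds for every
constraint in this file, but it is an observation, not a documented
undertaker requirement; the helper names are hypothetical.

```python
import re

# Matches tokens such as "(T0296)T22": target name, one-letter residue, number.
RESIDUE = re.compile(r"\((?P<target>\w+)\)(?P<aa>[A-Z])(?P<num>\d+)$")


def residue_number(token):
    match = RESIDUE.match(token)
    if match is None:
        raise ValueError(f"not a residue token: {token!r}")
    return int(match.group("num"))


def strand_lengths(line):
    """Return the lengths of the two strand ranges in a SheetConstraint line."""
    fields = line.split()
    a1, a2, b1, b2 = (residue_number(tok) for tok in fields[1:5])
    return abs(a2 - a1) + 1, abs(b2 - b1) + 1


proposed = [
    "SheetConstraint (T0296)I364 (T0296)I366 (T0296)A394 (T0296)R396 hbond (T0296)A365 10",
    "SheetConstraint (T0296)T22 (T0296)G26 (T0296)A394 (T0296)I398 hbond (T0296)T22 50",
]
for line in proposed:
    first, second = strand_lengths(line)
    status = "ok" if first == second else "MISMATCH"
    print(f"{status}: strand lengths {first} and {second}")
```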

try9 running on shaw(!)

Wed Jun 21 22:53:42 PDT 2006 George Shackelford

try9 isn't doing well. I'm going to do a try10 where I include the
try7.constraints and try7.helices. I'm also removing all but the 1j0dA
alignment. I want to see sheets!

try10 running on shaw.

Great. I just managed to kill the try9 when trying to restart the
try10. Well I didn't like what I saw of try9 anyway...

Thu Jun 22 14:22:44 PDT 2006 Kevin Karplus

The unconstrained costfcn prefers try2-opt2, try7-opt2, try5-opt2,
try3-opt2, try1-opt2.

It doesn't look like the servers were ever properly scored and looked
at with the unconstrained costfcn.  The scoring with the try1 costfcn
seems to have moved TIM-barrel wannabees to the top of the list.  They
are not very convincing.

The SP4_TS5 model looks interesting though, suggesting a subdomain for
T78-P437 and an outer domain.  The outer domain doesn't match our
secondary structure, but the inner domain looks promising.  I'll start
a subdomain prediction for it.

(T78-P437 prediction started on lopez).

Thu Jun 22 15:13:00 PDT 2006 George Shackelford

Unfortunately I had comments from earlier that got lost due to changes
in the README by Kevin.

>T0296 SP0239  Streptococcus pneumoniae, 445 res

I still see this as belonging to SCOP c.79.1.1, a tryptophan
synthetase for bacteria. The long helix fits nicely. Looking at
subdomains won't get this since it's really a single domain. However
Kevin can certainly look.

Try10 pushed the constraints at the expense of the actual structure. It
is still trying to match ~25 to ~415. I don't buy that unless ~25 matches
~395 on one side and ~415 on the other. I will try that as a constraint
and see if that can work. Otherwise I will reduce the strength of the
constraints so a viable structure can form.

I'm also putting these alignments back in:
InfilePrefix 1bksB/
include read-alignments-scwrl.under
InfilePrefix 1p5jA/
include read-alignments-scwrl.under
InfilePrefix 1qopB/
include read-alignments-scwrl.under
InfilePrefix 1kl7A/
include read-alignments-scwrl.under

try11 running on shaw

Thu Jun 22 18:12:19 PDT 2006 George Shackelford

Looking at try11-opt1, it seems that I must have set the constraints (or
something) wrong because the sheet (one sheet) that formed gets everything
messed up. I'll check on it a bit later, but I'm getting hungry now.


Thu Jun 22 19:34:20 PDT 2006 Kevin Karplus

I'm worried about this loss-of-comments problem in README files today.
It only seems to affect files that both George and I edit, but it has
happened twice today.  I *think* that I've been careful to check that
I'm working on the latest version of the file every time I edit, and I
haven't had problems with files where Firas and I have both been
commenting, or Grant and I.  George, have you been saving your changes
as soon as you make them?  Comments left in an unsaved edit buffer are
sure to get lost.

It is possible that I have been adding to old buffers without updating
them first---I'm certainly short enough on sleep to be making mistakes
like that.  I'll try to be doubly careful.

Thu Jun 22 19:40:07 PDT 2006 Kevin Karplus

George, can you explain your reasoning for believing in the tryptophan
synthetase?  (Sorry if it was in the text that got lost.)  The
multiple alignments have nothing but proteins of unknown function, and
our best hits were to TIM barrels.  It is true that the next best hit
after the TIM barrels was 1j0dA, (c.79.1.1), but you have not yet come
up with a convincing alignment to it.

Thu Jun 22 20:47:14 PDT 2006 George Shackelford

Ah, yes.

While working on the protein as two domains, I noticed that there were not
enough strands to form a TIM barrel; in fact, the length did not match a
typical barrel (I believe I mentioned this in the README). I looked into
the second domain, and found it failed to provide the necessary extra
strand (or strands) to form a barrel. So I decided that this was not a TIM
barrel with a second domain.

That's when I decided to go back to the main protein and not focus on two
domains. I found your comment about 1j0dA being the third highest scoring
as well as being a different fold. I took it and several other c.79.1.1
matches that I found in t2k-80-60-80+str2+near-backbone-11.dist. In fact
that distribution had the most c.79.1.1's. Since I am partial to str2 and
near-backbone-11 (they form the two 2ndary predictions I use in the RR
neural network) I found that more appealing.

When I ran TOP_HITS and included those, I got try2. It actually is
encouraging. The ehl2 script shows the first sheet forming pretty well,
completing the expected five strands (all the other c.79.1.1's have a
first region with a five-strand sheet), and the start of the other
six+ strand sheet in the other region. What was missing was the strand at
the end of the long strongly predicted helix which stretches across to
finish the other sheet. I figured that once I had set up the final sheet,
that all would be well.

Unfortunately I have not succeeded in forming that final sheet. Although I
defined that final sheet by try9, nothing seems to have come out
correctly. Perhaps I need to start again from try2 and use what I have
learned about Undertaker to get that last sheet in place.

And now for a bit on this family of proteins. SCOP c.79.1.1 designates
tryptophan synthetase. When I looked at some of the examples of this
family, I found they dimerize, in one case to a TIM barrel. I think that
the other dimer works to regulate the synthetase and that the long helix
elongates as part of the regulating. I should point out that the dimer
face is basically the five strand sheet.

So that brings us to the present. I'm not surprised that try2-opt2 scores
well. I just wish I could get it to finish what it started...

Ok, I'll do try2 -again- but adjust the constraints to what I want.
# my settings on these may be too mild
SheetConstraint (T0296)S344 (T0296)L345	(T0296)A326 (T0296)F327	hbond (T0296)L345	5
SheetConstraint (T0296)I364 (T0296)I366	(T0296)A394 (T0296)R396	hbond (T0296)A365	5
# the missing constraint??
SheetConstraint (T0296)T22 (T0296)G26	(T0296)A394 (T0296)I398	hbond (T0296)T22	10

I scaled down the rr.constraints to rr.0.1.constraints

try12 running on peep

Fri Jun 23 04:41:05 PDT 2006 Kevin Karplus

Since I rather liked the SP4_TS5 model, I ran it through VAST to see
where it came from.  The best hits were to 1gkpA, 1xrtA, 2aqvA.
1gkpA ranks 270 in the t06-80-60-80-str2+near-backbone-11 scoring and
1xrtA ranks 84 in t2k-100-40-40-str2+CB_burial_14_7, and those are
about the best these templates do with the HMMs.  Either SP4 has found
a hit that we cannot, or their model is wishful thinking.

I'll add 1gkpA and 1xrtA to our list of alignments and see if we can
generate anything from them.  I suspect that this will be useless, as
the HMMs score poorly enough that they probably can't align decently
for these templates.

Fri Jun 23 05:00:13 PDT 2006 Kevin Karplus

try13 started on whidbey to test the 1gkpA and 1xrtA alignments.

Fri Jun 23 10:40:59 PDT 2006 George Shackelford

Try12 fails as those before. Trying to get the long helix in place
results in the same "doughnut" as before. Try13-opt1 already looks
better. Unless there is a way to use ProteinShop to move the long
helix of try2 into a possible position, try2 is going to be the best
version of the 1j0dA I can get. I'm hitting a wall here and I have
another deadline on Monday.

Kevin comments that the HMMs score poorly when aligning to 1gkpA and
1xrtA. I suspect that I'm not getting the best alignment for 1j0dA.
With so little sequence similarity, alignments seem near impossible,
even when there is good agreement between the template's secondary
and our own secondary prediction.


Fri Jun 23 10:54:00 PDT 2006 Kevin Karplus

George definitely started editing an old version of this file, wiping
out my comments on try13, which is a complete misaligned mess.

I have started try14 on whidbey to polish up the SP4_TS5 model, even
though I have misgivings about submitting such a model, as we can't
reproduce the hit.

Fri Jun 23 13:30:10 PDT 2006 Kevin Karplus

The T78-P347 predictions don't look very useful.  To complete a
TIM-barrel the T78-P347 try1-opt2 added a lot of strands that were
predicted to be helices.

Fri Jun 23 17:30:23 PDT 2006 George Shackelford

I've looked about and based on unconstrained scoring and differences
in models I propose the following list to submit:
try2-opt2.pdb
try7-opt2.pdb
try12-opt2.pdb
try14-opt2.pdb
try5-opt2.pdb

Best we can do.

Fri Jun 23 18:02:25 PDT 2006 Kevin Karplus

I will do a submission, but a lot of George's work on this target was
wasted, as he was not working from alignments as he thought, but just
repeatedly polishing three server models.

Here is the extra text for the submission:

    Model 1 is try2-opt2, which scores best with an unconstrained costfcn,
    and with the try5 costfcn.

    Model 2 is try5-opt2, which has a few sheet fragments and scores
    second best with the try5 costfcn.

    Model 3 is try1-opt2, the result of fully automatic prediction

    Model 4 is try7-opt2, which appears to be based on an optimization of
	karypis.srv.2_TS3 


    Model 5  is try14-opt2, which was optimized from the server model
	    SP4_TS5, which scored surprisingly well with our try1 cost
	    function.  It appears to have been generated from 1gkpA or
	    1xrtA, and so we attempted to get our own alignments to these
	    templates.  The alignments our HMM provided resulted in
	    absolutely terrible models with very poor match to predicted
	    secondary structure, so we just refined the SP4_TS5
	    model with an unconstrained co

    Note: we had not intended to submit so many models based on server
    models, but there was an error in the scripts that were run to do
    optimizations, so that runs that were *supposed* to be starting over
    from alignments were actually using only 3 server models as starting
    points and just polishing them.  This resulted in a lot of frustration
    for the student, as things didn't get close to what he wanted, but
    identifying the error in the scripts came too late to help much.


Fri Jun 23 18:07:41 PDT 2006

I have started try15. I have uncommented the TryAllAligns, removed the server decoys, but I am still using the t2k, t04, t06 alignments and all-align.a2m.

try15 running on peep.

I have also started try16 with the try2 sheet constraints and without
the all-align.a2m. I hope these do better. I'm still learning how to
do adjustments to *.under.

try16 running on shaw.

Fri Jun 23 20:13:55 PDT 2006 Kevin Karplus

try16-opt1 just finished and it scores best with the try16 costfcn.
It looks like a TIM-barrel wannabe to me.
None of the 1j0dA alignments were scoring very well, so it added a
1ps9A alignment at the end which took over.

Note: even try2 came from karypis.srv.2_TS3, so all the tries from
try2 through try14 are from servers, not from our alignments.

I should see what the karypis.srv.2_TS3 model is based on, and make
sure that we are using alignments to its structure.
I started a VAST search from http://www.ncbi.nlm.nih.gov/Structure/VAST/vastsearch.html
Request ID: 871945564999004940
to see what templates were used.

Fri Jun 23 20:48:31 PDT 2006 Kevin Karplus

The VAST site seems to be unreachable now:
traceroute www.ncbi.nlm.nih.gov
traceroute: unknown host www.ncbi.nlm.nih.gov

I'll try again later.

Fri Jun 23 21:05:17 PDT 2006 Kevin Karplus

VAST search done.  The first domain comes from 1sr9B, 1aw5, 1mzhA,
1h04B, 1rh9A, 1xi3B, ... 
(c.1.10.5, c.1.10.3, c.1.10.1, c.1.3.1, ...)
 or 1oc7A, 1tml, 1n7kA, ... ( c.6.1.1, c.1.10.1, ...)
 
The second domain comes from 1m3uA, 1ezwA, 1nal1, ... .
(c.1.12.8, c.1.16.3, c.1.10.1 ...)

All signs here point to a pair of TIM barrels as the models.

Fri Jun 23 21:14:33 PDT 2006 Kevin Karplus

I'm going to start a try using only 1j0dA and other c.79 folds.
(1qopB, 1o58A, 1v7cA, 1j0dA, 1fcjA, 1e5xA, 1kl7A, 1v7cA, 1pwhA, 1p5jA,
1f2dA are in t04.ids)

Fri Jun 23 21:29:52 PDT 2006 Kevin Karplus

try17 started on cheep.

Fri Jun 23 21:13:58 PDT 2006 George Shackelford

I'm stymied. Try15 makes little sense. Try16 turned away from what
try2 looks like (try2 looks like 1j0dA) and based itself on 1ps9A. As
a result, it ends up trying to form a barrel.

Try2 is closer in appearance to 1j0dA than to karypis.srv.2_TS3. It
has two flat sheets like 1j0dA rather than barrel-like curves, and it
has a strand near the start that would make a match to the strand at
the start of 1j0dA. Kevin says it is based on karypis.srv.2_TS3 and
I'm trying to trace that now.

I'm not surprised to see that karypis is based on TIM barrels. That
can account for the curved shapes of the two sheets.

Fri Jun 23 21:57:04 PDT 2006 Kevin Karplus

I'm going to start one more run---one that allows TIM-barrels but uses
the try17 costfcn that doesn't have built-in sheet constraints.

I might also want to start one that has sheet constraints from a 1j0dA
alignment. 

Fri Jun 23 22:04:13 PDT 2006 Kevin Karplus

try18 started on shaw.

Fri Jun 23 22:39:19 PDT 2006 Kevin Karplus

try17-opt1 looks like trash to me.

Sat Jun 24 00:44:52 PDT 2006 George Shackelford

I've done a search for sequences with similar ehl2 profiles and I have
come up with the following:
 1np7A 1bllE 1h1lA 1nxcA 1uwsA 1js3A 1owlA 1qh8A 1pgjA
I'm adding them to the manual top hits and generating extra alignments.
Then I'm going to do a try with only this set as a source for
alignments. Just trying something new.


Sat Jun 24 07:27:11 PDT 2006 Kevin Karplus

The cost functions try20 and try21 were set up using the sheet
constraints from alignments to 1j0dA and 1i60A, respectively.

The try20 costfcn likes try7 best of the optimized server models,
but likes our try1 and try16 ok.

The try21 costfcn likes our try1 best with the server-based try3 next
and our try16 after that.

I can do a 2-hour run for each of try20 and try21 before the deadline.
(try20 started on peep, using only 1j0dA and other c.79 folds)
(try21 started on cheep, using all-align)

Sat Jun 24 07:44:08 PDT 2006 Kevin Karplus

I'll put together another submission now, downrating all of try2
through try14 as being server based.

With the unconstrained costfcn, this gives us
	try18
	try16
	try1
for server-based models we can add
	try2	from karypis.srv.2_TS3
	try14	from SP4_TS5

Based on unconstrained, try20, and try21, I'll reorder that to
	try1
	try16
	try18
	try2	from karypis.srv.2_TS3
	try14	from SP4_TS5

Note: our previous submission had try2, try5, try1, try7, try14, so
this one just replaces try5 and try7 with try16 and try18, and moves
our models from alignment ahead of the server models.
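
The bookkeeping in that note can be double-checked with a few lines.
This is only an illustration using the model names listed above; the
variable names are made up, and the server-based set contains just the
two models labelled as server-derived in this entry.

```python
previous = ["try2", "try5", "try1", "try7", "try14"]
current = ["try1", "try16", "try18", "try2", "try14"]
server_based = {"try2", "try14"}  # from karypis.srv.2_TS3 and SP4_TS5

# Models that left or entered the submission, in submission order.
dropped = [m for m in previous if m not in current]
added = [m for m in current if m not in previous]

# True when no alignment-based model is ranked below a server-based one.
flags = [m in server_based for m in current]
alignment_first = flags == sorted(flags)

print("dropped:", dropped)
print("added:", added)
print("alignment-based models ranked first:", alignment_first)
```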

Sat Jun 24 09:32:05 PDT 2006 George Shackelford

Though try17 scores better than try18, it has bad breaks and fails to
meet the constraints. I could live with the bad breaks if it at least
did better with the constraints, but it did not. Looks like a
desperate threading effort. Kevin will only include this if he wants
to replace one of the polished server models.

Sat Jun 24 09:58:48 PDT 2006 Kevin Karplus

try20-opt1 is also terrible, so I think that the 1j0dA idea is failing
badly.   All our half-decent models are TIM-barrel wannabes.

try21 may be decent, as it scores better than all but try1 with the
try21 cost function.  It comes pretty far down the list with the
unconstrained cost function.

Sat Jun 24 10:06:20 PDT 2006 Kevin Karplus

I don't particularly like try21, but I don't particularly like try1 either.
I think I'll replace try18 with try21 and give up on this target.

Sat Jun 24 10:12:02 PDT 2006 Kevin Karplus

So submitted.  Time to move on to the other 16 targets due this week.