Tue Aug 10 11:22:29 PDT 2004 T0272 DUE 30 Aug 2004

Tue Aug 10 12:50:08 PDT 2004 Kevin Karplus
This looks like a new fold, with no consensus between t02 and t04 methods.

Tue Aug 10 15:21:59 PDT 2004 Kevin Karplus
try1-opt2 looks like junk. The rr constraints are weak. Both t04 and t2k have similar secondary structure predictions, but one has stronger predictions where the other is weak (no clear preference). Multiple alignments have similar diversity. I'll make extra alignments, so that there are more starting points for subsequent runs in all-align.a2m, but I won't try to do any more work on this target today.

Fri Aug 13 15:12:24 PDT 2004 Kevin Karplus
I'm starting a short try2 run (running under valgrind) mainly to check undertaker for memory usage bugs. This is running on cluck and may take quite a while to finish, because of the overhead of all the valgrind checks.

Sat Aug 14 14:56:49 PDT 2004 Kevin Karplus
The try2 run died in a power failure. Rather than restarting it, I'll do a try3 run without valgrind. Maybe I should do a smaller valgrind test of undertaker at some point, but today I'm too busy trying to get predictions out.

Sun Aug 15 22:31:48 2004 George Shackelford
I've started looking at this. Looks like bad news to me. Weak signals all around; almost all of the matches are to hypothetical proteins, giving us no real ID on this protein. There may be something to glean from the two str2 logos, but not too much. I'll print them out and start a file.

Wed Aug 18 14:11:01 2004 George Shackelford
I checked the domain predictors and there is a possibility (slim) of a domain break at 110 or 130 (which is just past a helix). The try3 results done by Kevin show a distinct domain starting at 110. I may try to do separate domain searches and see what I get. Building 1-111 and 109-end domains.

Wed Aug 18 17:37:36 PDT 2004 Kevin Karplus
try3-opt2 looks like trash, but if George can find anything by breaking into domains, great.
Personally, I think our problem here is distant relationships, and so no signals, rather than the sort of contamination or masking that breaking into domains can help with.

Wed Aug 18 22:56:32 2004 George Shackelford
I'd say Kevin is right. There is little more that came up in the split (a slightly better hit in the second 'domain'). The structure that Undertaker created for the second domain does look kind of nice, until we look at the ehl2 and near. I'm afraid this one is a triage case. I did notice that there were a lot more constraints for the thin35 than for thin62; this is an artifact of the distant matches that we're using here. I'm doing one more try on the whole sequence using some of those constraints. I also fixed the basic cost function (which was old), added some of the t04 alpha constraints, and commented out the t2k. Try4 running on cluck.

Thu Aug 19 12:27:52 2004 George Shackelford
Something happened to try4 on cluck. I did get try4-opt1, which looks plain weird but still has two domains. I need to get 109-end/try2 running on crow.

Fri Aug 20 17:10:40 2004 George Shackelford
109-end/try2 looks nice, but it has strands that are not predicted to be strands. I can re-run try4 (as try5) and I need to get a second run of 1-111. Try5 running on crow.

Sat Aug 21 11:17:57 2004 George Shackelford
Try5 is awful: a few helices and loops, basically no sheets. I'm going to focus on getting 1-111 and 109-end to work and then paste them together.

Sun Aug 22 12:19:15 2004 George Shackelford
Working on 1-111 try2, finally. I have added some thin-35 constraints as bonuses, adjusted the included constraints, and boosted the t2k.ehl2 constraints. Let's see what we get. 1-111/try2 running on crow.

Sun Aug 22 15:10:20 2004 George Shackelford
Try2 looks like a mess. Looking back at 109-end, I see that we really don't have any consensus here. I'm going to work with the whole sequence again, even though it has not yielded anything.
Sun Aug 22 21:16:45 2004 George Shackelford
First I'm going to reinforce the strands I see there in ehl2. Then we're putting in some sheets where I think we need them. Try6 running on peep.

Mon Aug 23 10:02:27 2004 George Shackelford
This is as bad as T0273; the templates keep taking over and forcing solutions that are rather wrong based on moderate to strong ehl2 predictions. I wish Undertaker could distinguish 'weak' templates and defer more to the inserted constraints. Perhaps I should crank ALL constraints up! Yes! Try7 running on peep.

Tue Aug 24 23:31:11 2004 George Shackelford
Try7 is a mess. Looks like it's been stretched out on the rack. I'm going to do the 'extra_alignments' make and do a fresh run with the current constraints tuned down to 40 just to see what I can get. Still flailing away. Try8 running on ribbit.

Wed Aug 25 11:02:02 2004 George Shackelford
Try8 is a mess, but it's a better mess than try7. I'm going to continue with try8 and see what I can get. Hopefully I can get the sheets to at least form. Cranking up constraints (including sheets up to 30.0) and lowering break (again). Try9 on ribbit.

Thu Aug 26 10:13:31 2004 George Shackelford
Try9 was no improvement, but I did find I needed to change the sheet for s4 || s7. I also boosted sheet constraints and constraints in general while relaxing the dubious helices around 150. This area actually forms a sheet. Starting try10 on peep.

Thu Aug 26 12:44:34 2004 George Shackelford
While try10 is running, Kevin and I looked at try9. He found that the strand around residue 45 is part of a very bad break. This can be fixed by forming the sheet s2 ^v s3.

Fri Aug 27 14:17:08 PDT 2004 Kevin Karplus
I fixed the unconstrained.costfcn to refer to the right alpha files, and am creating score-all.unconstrained.pretty. The unconstrained costfcn likes try5 best, followed by try6, try4-opt1, try3, try10. I can't say that I care much for try5 though.
try11 appears to be trying to form a sheet, but failing rather miserably at it. L44-L64 are badly shattered and P57-V58 is still a big break. I think we need some better ideas for sheet constraints. Question: is L168-V182 one strand or two? Robetta model1 makes a hairpin, pairing F176 with R179 (an unlikely pairing). F176-L180 seems quite possible. Maybe I should do a run with just the strand and helix constraints turned up high, and see what (if anything) falls out.

Fri Aug 27 14:31:30 PDT 2004 Martina Koeva
I have been looking at the alignments in t2k.undertaker-align.pdb and some of the sheets seemed to fit our predictions. Also, we have a somewhat weakly predicted helix around A120-E132, at least in some of the predictions, so the pairing of strands in that region is not completely out of the question. I will try to extract the sheet constraints from the alignments and see whether anything useful comes from there.

Fri Aug 27 14:41:11 PDT 2004 Kevin Karplus
If I add a sheet constraint for the putative hairpin:
    SheetConstraint A173 F176 V183 L180 2.0 # hbond ?
then the robetta models 10,2,6,3,5 move to the top, followed by try3-opt2, and robetta models 1,8,4,9,7. Perhaps I should try an optimization with this cost function (which will mainly polish the robetta models).

Fri Aug 27 14:53:35 PDT 2004 Martina Koeva
Alignments 2 and 7 don't look bad and generate a few sheet constraints. The only problem that I see with them right now is that the last few sheet constraints in align2.sheets seem to involve residues in a strongly predicted helix E186-G195, but I guess I can leave them out. I will start a try13 (align2.sheets) and try14 (align7.sheets). I have turned down constraints from 80 to 30, and dry6.5 from 40 to 25.

Fri Aug 27 18:18:00 PDT 2004 Martina Koeva
Try13-opt1 is ready and the unconstrained function really does not like it, but it does look like it's attempting to form a sheet.
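
Note on the hairpin constraint above: a minimal sketch of how such a line could be read and checked outside undertaker. It assumes the layout seen in the 14:41 entry (keyword, four residue tokens, a numeric weight, optional "#" comment) and assumes that the two segments are antiparallel when their residue numbers run in opposite directions. The names parse_sheet_constraint and residue_number are hypothetical helpers, not part of undertaker, and the other constraint form used in this log (with "hbond" and a residue name) is not handled.

```python
import re
from typing import NamedTuple, Tuple


class SheetConstraint(NamedTuple):
    first: Tuple[int, int]    # residue numbers of the first strand segment
    second: Tuple[int, int]   # residue numbers of the second strand segment
    weight: float
    antiparallel: bool        # assumed: opposite numbering direction


def residue_number(token: str) -> int:
    """Turn a token such as 'A173' into the integer 173."""
    match = re.fullmatch(r"[A-Z](\d+)", token)
    if match is None:
        raise ValueError(f"not a residue token: {token!r}")
    return int(match.group(1))


def parse_sheet_constraint(line: str) -> SheetConstraint:
    """Parse 'SheetConstraint A173 F176 V183 L180 2.0 # comment'."""
    fields = line.split("#", 1)[0].split()   # drop any trailing comment
    if len(fields) != 6 or fields[0] != "SheetConstraint":
        raise ValueError(f"unexpected constraint layout: {line!r}")
    a, b, c, d = (residue_number(tok) for tok in fields[1:5])
    weight = float(fields[5])
    # Segments running in opposite directions are treated as antiparallel.
    antiparallel = (b - a) * (d - c) < 0
    return SheetConstraint((a, b), (c, d), weight, antiparallel)


hairpin = parse_sheet_constraint(
    "SheetConstraint A173 F176 V183 L180 2.0 # hbond ?"
)
print(hairpin)
```

For the hairpin line the first segment runs 173 to 176 and the second 183 to 180, so the sketch flags it as antiparallel, which is what a hairpin should be.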
Fri Aug 27 19:18:49 PDT 2004 Kevin Karplus
The best currently with the unconstrained costfcn is try12-opt2, which was optimized from the robetta models. It looks like it is trying to form a sheet. Maybe we should make some sheet constraints that would match its topology and try to get some real sheets.

Sat Aug 28 08:17:49 PDT 2004 Kevin Karplus
try13-opt2 actually forms some sheet! It looks like W2-L8 should be somehow parallel to R43-E49 with W46 having the hbonds---maybe
     1> mWLTKLVLn
    43> rLLWLRLep
    SheetConstraint W2 L8 R43 L49 hbond W46
It isn't really so surprising that try13 forms sheets, since it was based on an alignment that had sheets. The problem is with bad breaks.

Sat Aug 28 11:24:59 PDT 2004 Kevin Karplus
try15.costfcn keeps the sheet that I like in try13, but tries adding another strand (s1 as defined in "strands"). The result (try15-opt2) gets some strand pairing for s1, but not the requested pairing. The biggest problem with try15-opt2 seems to be the exposure of the buried sheet. We have to get the helices to cover the sheet.

Sat Aug 28 13:08:17 PDT 2004 Kevin Karplus
I'd better get something running while I go out and do errands. Perhaps a review of what sheets have formed or almost formed may help.
    try12-opt2:
        s3 || s4
        s4 ^v s3a
        s4 ^v s5     ? not real close
        s7a ^v s7b
        s7a ^v s1
    try13-opt2:
        s1 || s2
        s2 ^v s3
        s3 ^v s4
        s4 ^v s5
    try15-opt2:
        s2 ^v s3
        s3 ^v s4
        s4 ^v s5
        s6 ^v s1     not real close
        s1 || s3a
We have some major disagreements here! Putting together a consensus (or even adding consistently to one of these) will be tough. I'm too fuzzy-headed to think right now. Maybe I'll run my errands, get some lunch, and try again.

Sat Aug 28 15:15:13 PDT 2004 Kevin Karplus
Let's try
    s2 ^v s3 ^v s3a ^v s4 ^v s5 ^v s6 || s6a || s7a ^v s7b || s8
(letting s1 fall where it may). This is a pretty dumb topology, but it looks marginally feasible.
    s2:   42> erLLWRLep
    s3:       ltQVLVVpp    < 56
    s3a:  75> gYAQVFpp
    s4:       nARLRFRLRqg  < 92
    s5:  103> paKRLaa
    s6:       ptkLAVRKg    < 111
    s6a:      geLLRfgg     < 133
    s7a:      FLVAQVQLLKg  < 166
    s7b: 177> egRLEVVd
    s8:  204> lgLLSVap

Sat Aug 28 17:31:35 PDT 2004 Kevin Karplus
Although it does not score all that well with the unconstrained costfcn, try16-opt2 does seem to make some feasible sheets. I should add s1 || s2 and change s7b || s8 to s7a ^v s8.
    s1:    2> WLTKLVLnp
    s2:   42> erLLWRLep
    s6a:      geLLRfgg     < 133
    s8:  201> alglgLLSVap
    s7a:      FLVAQVQLLKg  < 166
    s7b: 177> egRLEVVd

Sat Aug 28 20:08:27 PDT 2004 Kevin Karplus
try17-opt2 scores even worse than try16-opt2 with the unconstrained cost function, though the sheet s2 through s5 is looking ok. I don't know if I should go on guessing sheet constraints (which I'm probably getting wrong), polish up the crummy models we currently have, or just give up and submit the trash that we've got so far. The deadline is noon Monday, so I have another day to make up my mind. I'm tempted just to give up---burned out at the end of CASP season. Maybe I'll try a polishing run from try14, which has the most beta sheet of any of our models.

Sat Aug 28 21:00:34 PDT 2004 Kevin Karplus
I spent a little time looking for homologs that had any information---there wasn't much, and what there was had been rather dubiously obtained by possibly distant similarity to other proteins. One homolog was labeled an hrpA-like helicase, but that family of helicases does not seem to have known structure. Another was labeled a transcriptional regulator. I suspect that both labelings may have been just recognizing an RK-rich protein, which is quite likely to bind to negatively charged DNA.

Sun Aug 29 03:05:27 PDT 2004 Martina Koeva
I just rescored all models that we have and try12-opt2 is still doing the best with try5-opt2 and try18-opt2 following closely behind.
From what I can tell, the advantage of try12-opt2 comes mostly from the break cost (6-point difference). try18 still looks pretty shattered, but that would probably be at least somewhat fixed during polishing. I am not quite sure (given the time and energy levels) that we can do much on new topologies, but I think it's worth polishing up some of the existing models that we have. I started try19 from all try18 models and turned up the break weight and phobic fit (as well as dry6.5), since the largest advantage of try12 over try18 came from them.

Sun Aug 29 08:11:24 PDT 2004 Kevin Karplus
try19-opt2 does now beat try12-opt2 on the unconstrained costfcn. I think that the s3a ^v s2 ^v s4 ^v s5 ^v s6a sheet looks fairly good. We may want to try extending it. One possibility is to swing the helix connecting s1 to s2 against the sheet, and make s1 || s6a. We could try putting V33 near L45, L29 near L99, and H26 near R106 to pack the helix on that side of the sheet. We may need to repel V156 from L99 to make room. I put such constraints into try20.costfcn, and tweaked the weights of constraints and the hbond terms until try19-opt2 barely beat try16-opt2.

Sun Aug 29 10:34:43 PDT 2004 Kevin Karplus
try20-opt1 does seem to have done crossover to get the body of try19-opt2 and move strand s1, though not quite to where it was requested to go. It has paid a high price in breaks and clashes, so it may need to be reoptimized with a more normal cost function.

Sun Aug 29 11:43:03 PDT 2004 Kevin Karplus
try20-opt2 is essentially no better than try20-opt1. I suppose it could be polished, but s1 is in a bad place and would need to move. Maybe a short polishing run would be worthwhile, just to see if it could go anywhere. With the try21 costfcn, which just has the helices and strands from t04 plus the sheets from try20-opt2, the best models are try19, try18, try14, and try20.
The unconstrained costfcn still orders the models try19, try12, try5, try18, try13, try14, try3.

Sun Aug 29 14:08:41 PDT 2004 Kevin Karplus
With the short optimization run, try21-opt2 is about the same as try18-opt2 on the try21 costfcn, but still trailing try19. On the unconstrained costfcn, it is still worse than try3. If I had to submit now (and we are running out of time!), I'd submit
    try19-opt2    best unconstrained
    try12-opt2    reoptimized robetta
    try13-opt2    decent sheet
    try1-opt2     full auto
I'm not sure what to include to make up 5 models. Do we have anything else decent that is different? The first template alignment is terrible, but the second one is OK, so I could include it.
    T0272-1md8A-t04-global-adpstyle1    4th best t2k template
(Actually, the second model in the t2k.undertaker-align file is the 4th template in t2k.best-scores.) For now, I'll just submit these 4, and resubmit if anyone has any ideas for a fifth model.

From martina@soe.ucsc.edu Sun Aug 29 20:05:41 2004
MIME-Version: 1.0
Date: Sun, 29 Aug 2004 20:05:36 -0700 (PDT)
From: Martina Koeva
To: Kevin Karplus
cc: sol@soe.ucsc.edu
Subject: Re: giving up on T0272
In-Reply-To: <200408292146.i7TLk8JI005812@cheep.cse.ucsc.edu>

I think that it's worth including the second template alignment from t2k.undertaker-align (as Kevin suggests in the README file). If it is the same alignment that I think it is, we directly used the sheet constraints from that alignment (through align2.sheets) for try13, which is one of the models that we are submitting. Another possible alignment would be:
    T0272-1svy-t2k-local-str2+CB_burial_14_7-0.4+0.4-adpstyle5
(7th model in t2k.undertaker-align). We used that alignment to generate the sheet constraints used for try14.

Martina

On Sun, 29 Aug 2004, Kevin Karplus wrote:
>
> I'm giving up on T0272. I submitted 3 of our "best" models and
> try1-opt2, but I couldn't find a sufficiently different 5th model that
> was worth submitting. Anyone have a favorite I missed?
> Anyone want to flail around some more with this target?
>

Mon Aug 30 09:43:08 PDT 2004 Kevin Karplus
I added T0272-1md8A-t04-global-adpstyle1 (= T0272.t2k.undertaker-align.pdb model 2) as model5.

Thu Nov 18 23:54:17 PST 2004 Martina Koeva
Based on the smooth gdt scores:
    best sam-t04    12.8029 (also model3)
    best submit     12.8029 (model3)
    model1           9.4145
    auto            10.2306
    align            4.3231
    robetta best    13.6318 (robetta model2)
    robetta1        11.9169
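
Note on the sheet-topology shorthand used in the Aug 28 entries: a minimal sketch of how a chain such as the one proposed at 15:15 could be expanded into explicit strand pairs, for example to compare the pairings that different models formed. It assumes "||" marks a parallel pairing (as in the W2-L8 / R43-E49 discussion) and that "^v" marks an antiparallel one; the second reading is an inference from the hairpin examples, not something the log states. parse_topology is a hypothetical helper, not an undertaker command.

```python
import re

# Assumed meaning of the two pairing symbols used in this log.
PAIRING = {"^v": "antiparallel", "||": "parallel"}


def parse_topology(chain):
    """Expand 's2 ^v s3 || s4' into [(s2, s3, antiparallel), (s3, s4, parallel)]."""
    # Tokens are either a pairing symbol or a strand name; spaces are optional.
    tokens = re.findall(r"\^v|\|\||\w+", chain)
    if len(tokens) % 2 == 0:
        raise ValueError(f"chain must alternate strand, symbol, strand: {chain!r}")
    pairs = []
    for i in range(1, len(tokens), 2):
        symbol = tokens[i]
        if symbol not in PAIRING:
            raise ValueError(f"expected a pairing symbol, got {symbol!r}")
        pairs.append((tokens[i - 1], tokens[i + 1], PAIRING[symbol]))
    return pairs


proposed = parse_topology(
    "s2 ^v s3 ^v s3a ^v s4 ^v s5 ^v s6 || s6a || s7a ^v s7b || s8"
)
for left, right, kind in proposed:
    print(f"{left:>4} {right:<4} {kind}")
```

With the pairs in this form, the overlap between two models is a set intersection of their pair lists, which makes disagreements like those among try12, try13, and try15 easy to list.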