Mon Jun 28 09:39:39 PDT 2004 T0216 Due 11 Aug
Looks like it may be new fold or difficult fold recognition.

Fri Jul 16 - ggshack
Doing a try2 with an initial constraint from the 134.rr.constraint file.

Sat Jul 17 17:20:37 PDT 2004 ggshack
Doing try3 with a couple of new RR-inspired constraints. I am still using the 134.rr.constraints since I haven't run anything new.

Mon Jul 19 15:49:08 PDT 2004 ggshack
It is looking a bit like two domains. I'm going to introduce a constraint from RR constraints to bring a prediction back into the fold. There are a lot of high-scoring predictions for this. I am going to take a bunch and run with bonus_constraints. The rr predictions are from the new 280 predictor. Starting try4 run.

Tue Jul 20 10:00:25 PDT 2004 ggshack
Results of try4 imply that I might try without bonus_constraints. I'll give that a run. Or do I just turn up strands and turn down helices? Doing a TRY5 without bonus_constraints for now, but I will consider adjusting strands/helices. I also adjusted the RR constraints to a max distance of 18.0.

Tue Jul 20 20:07:46 PDT 2004 ggshack
The results of try5 weren't as good looking as try4. I think I'm going to try to split this into 2 domains of equal size. That is what I have found from looking at try3 again. Now how is this done...

Wed Jul 21 23:13:21 PDT 2004 ggshack
It works... except for the calls to traincontactnn. I need to get the RR constraints built. Ah, I need to call my make.rr manually with a special TARGETDIR. I'll give that a try. Yes! That worked! But I find that the highest-scoring RRs are in the second half of the sequence, so I am starting the process of building the files for the 2nd domain. I'll check it tomorrow.

Fri Jul 23 16:47:19 PDT 2004 ggshack
Now building TRY2 for domain 2. Using a bit of the RR constraints as bonus. Also commented out all constraints from t2k!!

Fri Aug 6 15:11:37 PDT 2004 Kevin Karplus
George, it might be a good idea to reduce the strand and helix constraints to just those from t04.dssp-ehl2.constraints. You may want to pick up the constraints from the subdomains, rather than whole chain.

Using an unconstrained cost fcn, try4 scores best, then try3, robetta-model3, try2, try5, robetta-model6, ... Try4 and try3 look rather junky, but robetta-model3 has some good sheets. Maybe sheet constraints should be extracted from that.

I'll make a try6.costfcn that takes strand and helix constraints from t04.dssp-ehl2 and sheet constraints from robetta-model3, as well as rr constraints. With this cost function, the best-scoring are robetta models 3,6,1,7,2, followed by try5-opt2, robetta models 9,10, try4-opt2, robetta models 4,8,5, ...

I'll try running try6 from the alignments, though the alignments may not be good enough to get things right. I'm including the sub-domain alignments in 1-218/all-align.a2m and 216-435/all-align.a2m.

I just noticed that George has a lot of notes in the subdomain README files. If he is serious about what he's doing there, he'll have to paste it together soon to create a complete model. I'm not so sure he has the domain break in the right place, since R182-A190 may well be parallel to K222-R228. I see no notes on how George decided that it was two domains, nor how he decided where the domain break should be.

From karplus@soe.ucsc.edu Fri Aug 6 22:38:03 2004
Date: Fri, 6 Aug 2004 22:38:02 -0700
From: Kevin Karplus
To: ggshack@soe.ucsc.edu
CC: karplus@soe.ucsc.edu
Subject: T0216

T0216 is due very soon (Wed 11 Aug), and I don't see a lot of progress in the main README file. There are a few rather cryptic notes in the 216-435/README file, but I can't tell whether you are making progress or not.

Maybe we should meet Saturday morning at my house to go over the protein---you can show me what progress you've made and what you are trying to do.

I don't even know how you decided that 217 was the domain boundary---it doesn't look like one to me. Domain predictors have different opinions:
    DomPRED:    break at 300
    DomSSEA:    single domain
    Ginzu:      break at 135, 292
    RosettaDom: break at 272
    ssep-align: break at 250, 328
None of them that I have found predicted a break near 217.

Fri Aug 6 23:46:14 PDT 2004 George Shackelford
I took a stab at the possible break based on what I could determine from possible rr contacts; that suggested two domains with a break somewhere near the middle. The break at 272 or breaks at 250, 328 would match best. Robetta showed three equal parts but that seemed wrong.

I have separate notes for the two domains. See the README in subdirectory 1-218 and old1-218. I had to rebuild 1-218 because of some failure during the initial build. I do need to combine the notes.

In general I feel I am floundering. I could really use some help but I also have a wedding in Bodega Bay tomorrow. I can meet early around 8 to 9 and again on Sunday afternoon.

Sat Aug 7 09:07:13 PDT 2004 Kevin Karplus
George based the domain split on robetta models on try2 and try3, and on the clustering of the rr contacts in the second half. George commented on the dip in strand prediction at K378---possibly being an antiparallel match after that but a short match for the first part.

try6 does not look good. It scores a bit better than the robetta models, mainly based on predicted alpha cost functions. It does get a number of the rr constraints, but the sheets look bad. The best-scoring with the unconstrained file is still try4, which looks like junk.

Sun Aug 8 23:08:40 PDT 2004 George Shackelford
We're just going to bring this to a close. I created a file, superimposed.under and using the T0216.try3-opt2.pdb as the base... Oops! That's not what I did. I just did a superpose of old1-218/decoys/T0216.try9-opt2.pdb as the domain one and 216-435/decoys/T0216.try6-opt2.pdb.gz as domain two and I got best-super.pdb.
I need to clip and paste (I think) to get the final file. I'll check tomorrow. I think I'll do a second one using T0216.try3-opt2.pdb as the base and see what I get.

Mon Aug 9 09:56:02 PDT 2004 George Shackelford
Using try3 as a base for the domains gets too much of a collision (as seen in best-super2.pdb). I think I'll stick with the first approach (as seen in best-super.pdb). Also I believe he said to do no-break runs of the best domains to close gaps. I'll double check that. Otherwise we have a go. I'm going to patch the two domains together, combine their cost functions, and refine what I have to a final submission (however bad it may be).

Combined costfcn of
    old1-218/try9.costfcn
    216-435/try6.costfcn

Mon Aug 9 10:56:43 PDT 2004 Kevin Karplus
There is a best-super.pdb and a best-super2.pdb. I'm guessing that the newer best-super2.pdb is the one George meant, but it definitely needs some cut-and-paste work, as it contains multiple models.

When George has reduced it to a single, complete model, the result should be put in decoys, so that it can be scored (perhaps with a cost function whose constraints come from the subdomains). If it scores ok, we can do a polishing run and submit it. I'm sure it is wrong, but we don't have anything right to submit.

Mon Aug 9 20:01:35 PDT 2004 George Shackelford
For polishing, I need to put the contents of read-pdb.under in my try8.under. Actually I only need the line that refers to best-model.pdb, although the others might help as well(?). I'll see what I get. TRY8 now running on cluck.

Tue Aug 10 00:16:21 PDT 2004 George Shackelford
I figure that I should at least do a run with the other lower scoring tries as part of 'read-pdb.' That might just provide a different view. TRY9 running on ribbit.

Tue Aug 10 11:44:06 PDT 2004 George Shackelford
I like try8 better than try9. Both are foamy but try8 looks better. I'm going to crank up the dry stuff and include some robetta sheets (just to see) for try10.
Robetta model 3 sheets look ok, and it also scores well in unconstrained. TRY10 running on ribbit(?)

Tue Aug 10 18:15:06 PDT 2004 George Shackelford
I can't wait for try10 to submit. I'm selecting them now. I basically have
    try9-opt2
    try8-opt1
    try2-opt2
    try3-opt2
    try1-opt2
Now working on the model files.

Tue Aug 10 19:59:34 PDT 2004 Kevin Karplus
There are no Rosetta-repacked files for this target, because the MAX_RES limit was exceeded.

The best-scoring models with the unconstrained costfcn are
    try8-opt2
    try9-opt2
    robetta-models 10,2,3
    try10-opt1
    robetta-model 6
    try4-opt2
    T0216.best-model.pdb (where did THAT one come from?)
    robetta-model 1
    try3-opt2
    robetta-model 7
    try2-opt2
    robetta-model 4
    try1-opt2

It would be useful to know WHY George chose try9 over try8, and why try2 over try3 and try4. There is some discussion in the method files.

I modified unconstrained.costfcn, turning down the soft_clashes and breaks parameters, since the models are too crude to worry much about the fine details. The order (ignoring robetta models) is
    try8-opt2
    try9-opt2
    try10-opt1
    try3-opt2
    try6-opt2
    try4-opt2
    try5-opt2
    try2-opt2
    try1-opt2
    ...

From karplus@soe.ucsc.edu Tue Aug 10 20:12:46 2004
Date: Tue, 10 Aug 2004 20:12:45 -0700
From: Kevin Karplus
To: ggshack@soe.ucsc.edu
CC: karplus@soe.ucsc.edu
Subject: one reason for trouble with T0216

George,

One reason you were having trouble with T0216 is that the hbond cost functions were still set to the old weights that were appropriate before the hbond cost functions were rescaled. I sent email to everyone on 21 July about the rescaling. As a result your optimizations were not trying very hard to create or hold onto hbonds.

Kevin

============================================================

Tue Aug 10 20:23:55 PDT 2004 Kevin Karplus
Lacking any explanation from George about his reasoning in choosing the 5 models, I'm going to rearrange the list to cover more ground:
    try8-opt2   best-scoring unconstrained.costfcn,
                example of pasted-together from 2 domains
    try3-opt2   best scoring before trying separate domains
                The model that suggested a 2-domain solution.
    try6-opt2
    try4-opt2
    try1-opt2   fully automatic

If try10 ends up looking better than try8, it can replace it.

Hmm---try4 may not be such a great choice---tries 8 and 9 score better with the try4 cost function. Still it seems to provide a good diversity.

From ggshack@pacbell.net Tue Aug 10 20:39:55 2004
From: George Shackelford
To: Kevin Karplus
Subject: Re: one reason for trouble with T0216
Date: Tue, 10 Aug 2004 20:39:53 -0700
In-Reply-To: <200408110312.i7B3Cjot007707@cheep.cse.ucsc.edu>

Apparently I did have the correct values when doing 216-435 models but the old ones when doing 1-218.

On Tuesday 10 August 2004 08:12 pm, you wrote:
> George,
>
> One reason you were having trouble with T0216 is that the hbond cost
> functions were still set to the old weights that were appropriate
> before the hbond cost functions were rescaled. I sent email to everyone
> on 21 July about the rescaling. As a result your optimizations were
> not trying very hard to create or hold onto hbonds.
>
> Kevin

============================================================

Tue Aug 10 20:50:02 PDT 2004 Kevin Karplus
I submitted
    try8-opt2   best-scoring unconstrained.costfcn
    try3-opt2   best scoring before trying separate domains
    try6-opt2
    try4-opt2
    try1-opt2   fully automatic

For some reason, George did not include all the previous models when doing the try10 optimization, so try10-opt1 does not score even as well as try8-opt2 with the try10.costfcn. Unless try10-opt2 looks a lot better than try8-opt2, I'm inclined to stick with the ones we've submitted. They're all wrong anyway, so it probably doesn't matter much.

Fri Sep 24 20:52:29 PDT 2004 Kevin Karplus
Robetta beat us on this one, but the smooth_GDT scores are all so low that the difference is insignificant. I was right that the predictions were all wrong.

name                     length  missing_atoms  rmsd     rmsd_ca  GDT      smooth_GDT
robetta-model7.pdb.gz    435     0.0000         29.1765  28.9116  -7.4766  -7.3029
robetta-model8.pdb.gz    435     0.0000         30.0832  29.9499  -7.2430  -7.2605
robetta-model1.pdb.gz    435     0.0000         27.3194  27.1572  -6.9509  -7.0521
robetta-model10.pdb.gz   435     0.0000         29.2685  28.9489  -7.0093  -6.8166
robetta-model4.pdb.gz    435     0.0000         30.9365  31.0439  -6.8341  -6.7685
T0216.try5-opt2.pdb.gz   435     0.0000         27.9592  27.5679  -6.7173  -6.4765
model2.ts-submitted      435     0.0000         28.9246  28.4839  -6.3668  -6.4539
robetta-model6.pdb.gz    435     0.0000         27.3241  27.0722  -6.2500  -6.1634
model4.ts-submitted      435     0.0000         26.8172  26.6597  -5.7243  -5.9985
robetta-model5.pdb.gz    435     0.0000         27.7971  27.4259  -5.8411  -5.9062
model1.ts-submitted      435     0.0000         26.6566  26.3267  -5.7827  -5.9005
robetta-model9.pdb.gz    435     0.0000         28.8225  28.9140  -6.2500  -5.8578
model5.ts-submitted      435     0.0000         31.3190  30.9191  -6.1332  -5.8333
robetta-model3.pdb.gz    435     0.0000         27.8584  27.4620  -5.6659  -5.8108
T0216.best-model.pdb.gz  435     0.0000         28.3387  28.1215  -5.4907  -5.6690
model3.ts-submitted      435     0.0000         25.8253  25.6111  -5.7243  -5.5664
robetta-model2.pdb.gz    435     0.0000         27.3316  27.2602  -5.1986  -5.4496
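
A minimal sketch of how the score table above could be re-ranked by a different column. This is not the script that produced the table. The five rows are copied from the table; the function names parse_rows and rank are hypothetical, and the sketch assumes that a more negative smooth_GDT means a better model, which is how the table appears to be ordered.

```python
# Column names as they appear in the table header.
COLUMNS = ["name", "length", "missing_atoms", "rmsd", "rmsd_ca", "GDT", "smooth_GDT"]

# A few rows copied verbatim from the table (whitespace-separated).
ROWS = """\
robetta-model7.pdb.gz 435 0.0000 29.1765 28.9116 -7.4766 -7.3029
model2.ts-submitted 435 0.0000 28.9246 28.4839 -6.3668 -6.4539
model4.ts-submitted 435 0.0000 26.8172 26.6597 -5.7243 -5.9985
model1.ts-submitted 435 0.0000 26.6566 26.3267 -5.7827 -5.9005
model3.ts-submitted 435 0.0000 25.8253 25.6111 -5.7243 -5.5664
"""


def parse_rows(text):
    """Turn whitespace-separated table rows into a list of dicts."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != len(COLUMNS):
            continue  # skip blank or malformed lines
        row = {"name": fields[0]}
        for col, value in zip(COLUMNS[1:], fields[1:]):
            row[col] = float(value)
        rows.append(row)
    return rows


def rank(rows, key="smooth_GDT"):
    """Sort ascending on one column (assumption: lower is better)."""
    return sorted(rows, key=lambda r: r[key])


rows = parse_rows(ROWS)
by_gdt = [r["name"] for r in rank(rows, "smooth_GDT")]
by_rmsd = [r["name"] for r in rank(rows, "rmsd_ca")]

print("by smooth_GDT:", by_gdt)
print("by rmsd_ca:   ", by_rmsd)
```

On these five rows the two rankings are exact reverses of each other: model3.ts-submitted has the lowest rmsd_ca but the weakest smooth_GDT. That fits the note that the differences are insignificant at scores this low.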