6 Oct 1998 Kevin Karplus

The "abstracts" have been moved to
    /projects/compbio/papers/casp3/abstracts

Sun Jun 21 17:38:03 PDT 1998 Melissa Cline

This directory serves as a repository for CASP3 predictions to be
submitted, acknowledgements of those predictions, and programs for
assembling predictions.

Only predictions that will be submitted belong here. Interesting-looking
things that we might want to submit should stay in the target directory
until we decide to submit them. This keeps the submit directories from
getting so cluttered that we can't remember what we did and didn't submit.

Predictions may be submitted via email to
    submit@predictioncenter.llnl.gov
or via the web at
    http://PredictionCenter.llnl.gov/casp3/submit/casp3-ver.html

Here are some important files found in this directory:
- registration.fold: CASP3 registration ID for the fold recognition team.
- methods.fold: general description of target98 fold recognition.
- remark: sample remark file with the names of the authors.
- AlForNewFold: alignment record to be used when predicting a new fold.
- fasta2al: program for producing an AL record from a FASTA-format
  alignment.
- al2fasta: program for converting a prediction back into a FASTA
  alignment.
- format_prediction: script for assembling a prediction record out of its
  component files.

Here are the steps for assembling a single prediction:

1. Make a directory submit/.

2. In the directory submit/, write a file results. summarizing the results
   on that target: scores, best hits, quality of best alignments.
   Optionally, write a file methods. describing any special methods used
   on this prediction.

3. Create a methods file as follows:
       cat ../methods.fold results. > methods.

4. If not predicting a new fold, copy the prediction alignment to the
   directory of step 1.
   Create an AL record from the alignment as follows:
       ../fasta2al <-candidate align_filename> <-template templateName> \
           [-target targetName] [-workingcandidate align_2_filename] \
           >
   All switches can be abbreviated to two characters (e.g. -ca instead of
   -candidate).

   If the alignment is pairwise, the target name will be deduced from the
   alignment. Otherwise, the -target switch is required to specify which
   sequence to include in the pairwise alignment with the template
   sequence.

   fasta2al will compare the target sequence in the alignment to the
   sequence in the file casp3//.seq. If the two versions of the sequence
   are different, it will print out an error message and halt.

   fasta2al will also compare the template sequence from the alignment
   with the SEQRES sequence produced by the pdb summary tool. If the
   sequence versions are different, it will print out a warning message
   and use SAM to reconcile differences in the sequences. Any missing
   residues will be added to the target/template alignment as an insert
   in the template sequence, a '.' in the target sequence. If the
   -workingcandidate (abbrev: -wo) switch is specified, this alignment
   will be printed to the specified filename.

   sample command line:
       ../fasta2al -cand 1prcC-t54-global.pw.a2m -te 1prcC \
           > 1prcC-t54.global.pw.al

5. Assemble a remark file with whatever contents you wish. Remark records
   generally don't affect the assessment of the prediction, and aren't
   to be distributed at the meeting with the methods records. Good things
   for the remarks section are a prettyalign of the alignment submitted,
   group name, names of authors, and any previous predictions that this
   prediction supersedes.

6.
   Assemble a prediction file from the components with format_prediction
   as follows:
       ../format_prediction <-target targetnum> <-methods methods_file> \
           <-al al_file> [-remark remark_file] >

   sample command line:
       ../format_prediction -target T0054 -methods \
           methods.t54 -remark remark.t54 \
           -al 1prcC-t54-global.pw.al >submit.t54

   If making a new fold prediction, use <-al ../ALForNewFold>.

   Note that the target number should probably be as it appears in the
   CASP3 documentation: with a capital T and a bunch of zeros, e.g. T0061
   rather than t61.

7. This might not be necessary, but is safer: if making a new fold
   prediction, edit the submit file and change the first line from
   "PFRMAT AL" to "PFRMAT TS".

8. If not making a new fold prediction, verify the alignment with
   al2fasta as follows:
       ../al2fasta -output [-template template_name] \
           <

   sample command line:
       ../al2fasta -output t54-1prcC.a2m < submit.t54

   The -template switch is provided because at some later time, we will
   have the capability of concatenating multiple predictions. This
   feature is provided for extracting the alignment of one particular
   template structure.

   Do a prettyalign of align_file.a2m and the original alignment to
   verify that the alignments are consistent, that is, that the beginning
   and end of the match columns are the same.

9. Mail the submit file to the email address listed above. Shortly
   afterwards, you will be mailed an acknowledgement. Save the
   acknowledgement in the submit directory.

Extra details:

domain hits off different templates
-----------------------------------
When making non-overlapping predictions to different targets (e.g. domain
hits), the AL records should be concatenated. The overall model should
start with the first domain hit and include a normal AL record. After the
TER record, the next domain hit should begin with the PARENT record and
continue down through the TER record. This is achieved by concatenating
AL files in the correct order.
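
The concatenation of domain hits can be sketched in Python. This is only
an illustration, not one of the tools in this directory. It assumes that
each per-domain AL fragment starts with a PARENT line and ends with a TER
line; the function name concatenate_al_records and the template names in
the usage note are made up for the example.

```python
def concatenate_al_records(records):
    """Join per-domain AL fragments in the order given.

    Assumption (not confirmed by this README): every fragment begins
    with a PARENT line and ends with a TER line.
    """
    joined = []
    for index, text in enumerate(records):
        # Drop blank lines and trailing whitespace so fragments join cleanly.
        lines = [ln.rstrip() for ln in text.splitlines() if ln.strip()]
        if not lines:
            raise ValueError(f"fragment {index} is empty")
        if not lines[0].startswith("PARENT"):
            raise ValueError(f"fragment {index} does not start with PARENT")
        if lines[-1] != "TER":
            raise ValueError(f"fragment {index} does not end with TER")
        joined.extend(lines)
    return "\n".join(joined) + "\n"
```

For example, passing two fragments such as "PARENT tmplA ... TER" and
"PARENT tmplB ... TER" (hypothetical template names) returns one block in
which the second PARENT line directly follows the first TER line. The
order of the input list decides the order of the domain hits.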

chains in the target sequence
-----------------------------
When the target sequence is broken into one or more chains, the chain ID
must appear in the AL record immediately before the target sequence
number. In other words, to specify an alignment to target chain A,
instead of
    A 4 I 71
there should be
    A A4 I 71

predictions detailing protein interactions
------------------------------------------
See t66 for an example. A TS record should be submitted in place of an AL
record. There's a conversion tool available via the CASP3 web pages. The
coordinates of both proteins should be delimited with a single pair of
PARENT and TER records. All parents involved in the interaction should be
listed on this single PARENT line.

Multiple Models
---------------
To submit multiple models, a separate submission of each model is
necessary. You must also alter the MODEL {1,2,3,4,5} field accordingly.
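
Altering the MODEL field for each separate submission can be sketched in
Python. This is an illustration only, not one of the tools in this
directory. It assumes the submit file holds exactly one line of the form
"MODEL <number>"; the function name set_model_number is made up for the
example.

```python
import re

# Assumed shape of the MODEL line: the keyword, whitespace, one integer.
MODEL_LINE = re.compile(r"^MODEL\s+\d+\s*$")


def set_model_number(submission, model):
    """Return a copy of the submission text with its MODEL line set to `model`."""
    if model not in (1, 2, 3, 4, 5):
        raise ValueError("model must be one of 1, 2, 3, 4, 5")
    lines = submission.splitlines()
    hits = [i for i, ln in enumerate(lines) if MODEL_LINE.match(ln)]
    if len(hits) != 1:
        # Refuse to guess when the line is missing or repeated.
        raise ValueError(f"expected exactly one MODEL line, found {len(hits)}")
    lines[hits[0]] = f"MODEL {model}"
    return "\n".join(lines) + "\n"
```

Each model still has to be mailed as its own submission; the sketch only
produces the edited text for one of them and leaves every other line
untouched.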