Evaluation of threading submissions                              12/1/1996
===================================

Aron Marchler-Bauer, Steve Bryant, for CASP2 Organizers


CASP2 - threading/fold recognition submissions
----------------------------------------------

Submissions in the category threading/fold recognition are evaluated
automatically by comparing them to the results of structure-structure
comparison methods. The predictors had to use a standardized format for
their submissions; careful verification of the format during the submission
process made the data likely to be free of syntax errors. The same format
has been used for the results of structure-structure comparison methods, so
that they could be verified in the same way. The evaluation software
compares one threading/fold recognition submission at a time against one
structure comparison result, which is used as the 'standard of truth'.


Example of the format:
----------------------

The format is described in detail in:
"http://www.mrc-cpe.cam.ac.uk/casp2/fr-submission.html" (UK) or
"http://iris4.carb.nist.gov/casp2/fr-submission.html" (USA).

Here's an example of how a simple prediction submission for one of the
targets might look.
This submission reports the result from a comparison of the target sequence
T0021 against a small database of 7 domains, with one hit against the domain
1ALA _ 1 reported:

PFRMAT FRV1
REMARK --------------------------------------------------------
AUTHOR xxxx-xxxx-xxxx
REMARK --------------------------------------------------------
TARGET T0021
REMARK --------------------------------------------------------
SEQRES T0021 GAKEPDPDKLKKAIVQVEHDERPAR
SEQRES T0021 LILNRRPPAEGYAWLKYEDDGQEFE
SEQRES T0021 ANLADVKLVALIEG
REMARK --------------------------------------------------------
TSCORE T0021 0 0.0 NONE _ 0
TSCORE T0021 0 1.0 1ALA _ 1
TSCORE T0021 0 0.0 8ACN _ 1
TSCORE T0021 0 0.0 1GKY _ 1
TSCORE T0021 0 0.0 1THT A 2
TSCORE T0021 0 0.0 1DSB A 1
TSCORE T0021 0 0.0 1PRT F 0
TSCORE T0021 0 0.0 1UBI _ 0
REMARK --------------------------------------------------------
TALIGN T0021 0 3 59 1ALA _ 1 260 316 1.0 1
REMARK --------------------------------------------------------
STRSUB 1ALA _ 1 3 98
STRSUB 1ALA _ 1 247 318
STRSUB 1ALA _ 2 89 246
STRSUB 8ACN _ 1 67 106
STRSUB 8ACN _ 1 122 199
STRSUB 8ACN _ 1 516 520
STRSUB 1GKY _ 1 0 32
STRSUB 1GKY _ 1 83 186
STRSUB 1THT A 2 163 184
STRSUB 1DSB A 1 21 61
STRSUB 1DSB A 1 151 188

The results from a structure comparison search could look just the same,
except for the format descriptor reading:

PFRMAT SCV1

and some additional records describing the quality of the similarity:

RMSIDE T0021 0 1ALA _ 1 57 7.79 7.02

The three numbers are: the length of the structural alignment in residues,
the RMS deviation between the aligned structures in Angstrom, and the
percentage of identical residues in the (structural) alignment.


Evaluation Criteria
-------------------

A detailed description and discussion of the criteria used for the
evaluation can be found in:
"http://www.mrc-cpe.cam.ac.uk/casp2/fr-criteria.html" (UK) or
"http://iris4.carb.nist.gov/casp2/fr-criteria.html" (USA).
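To make the record layout from the format example more concrete, here is a
minimal Python sketch that splits TSCORE records into fields. The field names
(target, subset, score, structure, chain, domain) and the helper
parse_tscores are assumptions inferred from the example above; the
authoritative definition is the format description referenced earlier.

```python
from typing import NamedTuple


class TScore(NamedTuple):
    # Field names are hypothetical labels inferred from the example records.
    target: str
    subset: int
    score: float
    structure: str
    chain: str
    domain: int


def parse_tscores(text: str) -> list[TScore]:
    """Collect all whitespace-separated, 7-field TSCORE records from text."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 7 and fields[0] == "TSCORE":
            _, target, subset, score, structure, chain, domain = fields
            records.append(TScore(target, int(subset), float(score),
                                  structure, chain, int(domain)))
    return records


example = """\
TSCORE T0021 0 0.0 NONE _ 0
TSCORE T0021 0 1.0 1ALA _ 1
TSCORE T0021 0 0.0 8ACN _ 1
TSCORE T0021 0 0.0 1GKY _ 1
TSCORE T0021 0 0.0 1THT A 2
TSCORE T0021 0 0.0 1DSB A 1
TSCORE T0021 0 0.0 1PRT F 0
TSCORE T0021 0 0.0 1UBI _ 0
"""

hits = parse_tscores(example)
for hit in hits:
    print(hit)
```

Lines that are not TSCORE records (REMARK, SEQRES, TALIGN, STRSUB) are simply
skipped by this sketch.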
The evaluation software calculates and reports the following numbers:

 1) Prediction confidence
 2) Size of the prediction database
 3) Number of hits with non-zero score
 4) Number of correct hits
 5) Maximal possible number of correct hits
 6) Threading specificity
 7) Threading sensitivity
 8) Best Case specificity
 9) Chance (best case) specificity
10) Alignment RMS-deviation
11) Number of correctly aligned residues
12) Alignment specificity
13) Alignment sensitivity
14) Alignment mean shift error
15) Alignment coverage
16) Alignment contact specificity
17) Alignment contact sensitivity
18) Structure comparison alignment weight
19) Structure comparison length of alignment
20) Structure comparison RMSD
21) Percentage of identical residues in structure comparison alignment
22) Prediction alignment weight
23) Prediction alignment length
24) Percentage of identical residues in prediction alignment

These are discussed in detail below:


1) Prediction confidence ("Conf")
---------------------------------

The predictors were asked to specify the overall confidence in their
prediction. They could do so by assigning a non-zero score to a dummy
structure "NONE". Assigning 100% of the score to "NONE" would mean that the
author(s) of the prediction think that none of the structures listed in the
prediction are suitable three-dimensional models for the target sequence.
Assigning 50% of the overall score to "NONE" would mean that the
predictor(s) thought that the odds of having found the right answer would be
50%.

Example:
The example uses the "submission" listed above. The same object is used as
the actual "submission" and as the "standard of truth".
The evaluator reports:

Conf TDbs TNt0 TCrct TCmx TSpc TSns TBst TChnc ARms ACrct ASpc ASns Shft Covr ACSpc ACSns Sub SCo
T0021 0 100 7 1 1 1 100 100 100 14.29 7.79 57 100 100 0 100 100 100

If the first two "TSCORE-records" are changed to:

TSCORE T0021 0 0.5 NONE _ 0
TSCORE T0021 0 0.5 1ALA _ 1

the reported prediction confidence changes to:

Conf TDbs TNt0 TCrct TCmx TSpc TSns TBst TChnc ARms ACrct ASpc ASns Shft Covr ACSpc ACSns Sub SCo
T0021 0 50 7 1 1 1 100 100 100 14.29 7.79 57 100 100 0 100 100 100

The fields of a result line are, from left to right:

  Names - Target (here "T0021") and Subset (here "0")
  Conf  - Confidence
  TDbs  - Size of the Database
  TNt0  - No. of Non-Zero Scores
  TCrct - No. of correct hits in Submission
  TCmx  - No. of possible correct hits
  TSpc  - Threading Specificity
  TSns  - Threading Sensitivity
  TBst  - Best Case Specificity
  TChnc - Chance Specificity
  ARms  - Average RMS-Deviation between Model and Target Structure
  ACrct - Average Number of correctly aligned residues
  ASpc  - Average Alignment Specificity
  ASns  - Average Alignment Sensitivity
  Shft  - Average Mean Shift Error
  Covr  - Average Alignment Coverage
  ACSpc - Average Contact Specificity
  ACSns - Average Contact Sensitivity


2) Size of prediction database ("TDbs")
---------------------------------------

The predictors were asked to submit, along with the hits having a non-zero
score and the corresponding alignments, hits to all other database proteins
used in their search procedure, with a score of 0. The number of database
structures listed in the submission is reported.


3) Number of hits with non-zero score ("TNt0")
----------------------------------------------

This is the actual number of hits, as found in the prediction, which have
been assigned a non-zero score, even if this score is very small and
represents a negligible fraction of the prediction's total bet.


4) Number of correct hits ("TCrct")
-----------------------------------

Out of all the prediction's hits with a non-zero score ("TNt0"), some might
be "correct hits" - i.e. in the structure-structure comparison results, used
as the standard of truth, hits to the same database chain were reported with
a non-zero score too.


5) Maximal possible number of correct hits ("TCmx")
---------------------------------------------------

This is the number of hits, having a non-zero score in the
structure-structure comparison results, which are listed in the prediction's
search database. "TCmx" represents the maximal possible value for "TCrct",
given the search database used for the prediction and the results of the
structure-structure comparison method.


6) Threading Specificity ("TSpc")
---------------------------------

The Threading Specificity is calculated as the percentage of the "bet" that
was placed on structures which are similar to the target protein. Predictors
could bet on no structure at all (except for the dummy "NONE"; the
specificity is 0 by default in this case), on one structure, or on several
structures. The amount of the bet placed on actual structures is normalized
to sum up to 1.0 for the prediction evaluation.
The confidence listed in the column labeled "Conf" is therefore not used for
calculating the threading- and alignment-specific quantities listed and
described below.

Structure comparison methods assign scores to individual matches of the
target structure to a database structure, or simply report whether a
database structure is similar to the target or not. For the calculation of
threading specificity, scores from the submission are normalized to sum up
to 1.0, while structure comparison scores are scaled down, if necessary, so
that their maximum is 1.0. These scores are interpreted as the probability,
as assessed by the structure-structure comparison method, that the two
structures are actually similar. Threading specificity is then calculated by
multiplying the values for all matching TSCORE-records and summing up:

                        Score assigned to correct hit i   Str.Comp. Score for hit i
Thr.Spec. = 100 * sum ( ------------------------------- * ------------------------- )
                  (i)     Sum of Scores for all hits       max(1,Str.Comp. Scores)

Example 1:
----------

Submission                      Structure comparison
TSCORE T0021 0 0.0 NONE _ 0     TSCORE T0021 0 0.0 NONE _ 0
TSCORE T0021 0 1.0 1ALA _ 1     TSCORE T0021 0 1.0 1ALA _ 1
TSCORE T0021 0 0.0 8ACN _ 1     TSCORE T0021 0 0.0 8ACN _ 1
TSCORE T0021 0 0.0 1GKY _ 1     TSCORE T0021 0 0.0 1GKY _ 1
TSCORE T0021 0 0.0 1THT A 2     TSCORE T0021 0 0.0 1THT A 2
TSCORE T0021 0 0.0 1DSB A 1     TSCORE T0021 0 0.0 1DSB A 1
TSCORE T0021 0 0.0 1PRT F 0     TSCORE T0021 0 0.0 1PRT F 0
TSCORE T0021 0 0.0 1UBI _ 0     TSCORE T0021 0 0.0 1UBI _ 0

In the example above, the prediction exactly matches the structure
comparison. Threading specificity is calculated as follows:

  *) normalize submission scores (doesn't change the column in this case)
  *) matching pairs are
       T0021 0 1ALA _ 1 <---> T0021 0 1ALA _ 1
  *) multiply values for all the matching pairs: 1.0 * 1.0
  *) sum up to: 1.0 (which is 100% of what's possible)

The evaluator reports:

Conf TDbs TNt0 TCrct TCmx TSpc TSns TBst TChnc ARms ACrct ASpc ASns Shft Covr ACSpc ACSns Sub SCo
T0021 0 100 7 1 1 1 100 100 100 14.29 7.79 57 100 100 0 100 100 100

Example 2:
----------

Submission                      Structure comparison
TSCORE T0021 0 0.0 NONE _ 0     TSCORE T0021 0 0.0 NONE _ 0
TSCORE T0021 0 0.6 1ALA _ 1     TSCORE T0021 0 1.0 1ALA _ 1
TSCORE T0021 0 0.3 8ACN _ 1     TSCORE T0021 0 0.0 8ACN _ 1
TSCORE T0021 0 0.1 1GKY _ 1     TSCORE T0021 0 0.0 1GKY _ 1
TSCORE T0021 0 0.0 1THT A 2     TSCORE T0021 0 0.0 1THT A 2
TSCORE T0021 0 0.0 1DSB A 1     TSCORE T0021 0 0.0 1DSB A 1
TSCORE T0021 0 0.0 1PRT F 0     TSCORE T0021 0 0.0 1PRT F 0
TSCORE T0021 0 0.0 1UBI _ 0     TSCORE T0021 0 0.0 1UBI _ 0

In this example, the prediction assigns scores to three different
structures, unlike the structure comparison. The threading specificity is
calculated as follows:

  *) normalize submission scores (doesn't change the columns in this case)
  *) matching pairs are
       T0021 0 1ALA _ 1 <---> T0021 0 1ALA _ 1
       T0021 0 8ACN _ 1 <---> T0021 0 8ACN _ 1
       T0021 0 1GKY _ 1 <---> T0021 0 1GKY _ 1
  *) multiply the values for all
the matching pairs: 0.6, 0.0, 0.0 *) sum up to: 0.6 (which is 60% of what's possible) The evaluator reports: Conf TDbs TNt0 TCrct TCmx TSpc TSns TBst TChnc ARms ACrct ASpc ASns Shft Covr ACSpc ACSns Sub SCo T0021 0 100 7 3 1 1 60 100 100 14.29 7.79 57 100 100 0 100 100 100 Example 3: ---------- Submission Structure comparison TSCORE T0021 0 0.0 NONE _ 0 TSCORE T0021 0 0.0 NONE _ 0 TSCORE T0021 0 0.6 1ALA _ 1 TSCORE T0021 0 0.5 1ALA _ 1 TSCORE T0021 0 0.3 8ACN _ 1 TSCORE T0021 0 1.0 8ACN _ 1 TSCORE T0021 0 0.1 1GKY _ 1 TSCORE T0021 0 0.0 1GKY _ 1 TSCORE T0021 0 0.0 1THT A 2 TSCORE T0021 0 0.0 1THT A 2 TSCORE T0021 0 0.0 1DSB A 1 TSCORE T0021 0 0.0 1DSB A 1 TSCORE T0021 0 0.0 1PRT F 0 TSCORE T0021 0 0.0 1PRT F 0 TSCORE T0021 0 0.0 1UBI _ 0 TSCORE T0021 0 0.0 1UBI _ 0 In this example, the prediction assigns scores to three different structures, the structure comparison has found one structure being only marginally similar to the target structure, the threading specificity is calculated like the following: *) normalize submission scores (doesn't change the columns in this case) *) matching pairs are T0021 0 1ALA _ 1 <---> T0021 0 1ALA _ 1 T0021 0 8ACN _ 1 <---> T0021 0 8ACN _ 1 T0021 0 1GKY _ 1 <---> T0021 0 1GKY _ 1 *) multiply the values for all matching pairs: 0.3, 0.3, 0.0 *) sum up to: 0.6 The evaluator reports: Conf TDbs TNt0 TCrct TCmx TSpc TSns TBst TChnc ARms ACrct ASpc ASns Shft Covr ACSpc ACSns Sub SCo T0021 0 100 7 3 2 2 60 66.67 100 64.29 6.28 32.5 100 100 0 100 100 100 7) Threading sensitivity ("TSns") --------------------------------- Threading sensitivity is the percentage of similar structures, present in the prediction submission search set, that were detected as being similar by the prediction method. It is calculated as the percentage of the structure comparison methods "bet" that is correct, when compared against the threading/fold-recognition submission. 
To calculate this number, the same procedure is used as in the calculation
of threading specificity, but the schemes of score normalization are
swapped.

                        Score assigned to correct hit i   Str.Comp. Score for hit i
Thr.Sens. = 100 * sum ( ------------------------------- * ------------------------- )
                  (i)       max(Prediction Scores)          sum(Str.Comp. Scores)

For the examples from above:

Example 1:
----------

  *) max. submission score is set to 1.0 (doesn't change the columns)
  *) normalize structure comparison scores (doesn't change the columns)
  *) matching pairs are
       T0021 0 1ALA _ 1 <---> T0021 0 1ALA _ 1
  *) multiply values for all the matching pairs: 1.0
  *) sum up to: 1.0 (which is 100% of what's possible)

The evaluator reports:

Conf TDbs TNt0 TCrct TCmx TSpc TSns TBst TChnc ARms ACrct ASpc ASns Shft Covr ACSpc ACSns Sub SCo
T0021 0 100 7 1 1 1 100 100 100 14.29 7.79 57 100 100 0 100 100 100

Example 2:
----------

  *) max. submission score is set to 1.0 - this changes 0.6, 0.3, and 0.1 to
     1.0, 0.5, and 0.16666667
  *) normalize structure comparison scores (doesn't change the columns)
  *) matching pairs are
       T0021 0 1ALA _ 1 <---> T0021 0 1ALA _ 1
  *) multiply values for all the matching pairs: 1.0
  *) sum up to: 1.0 (which is 100% of what's possible)

The evaluator reports:

Conf TDbs TNt0 TCrct TCmx TSpc TSns TBst TChnc ARms ACrct ASpc ASns Shft Covr ACSpc ACSns Sub SCo
T0021 0 100 7 3 1 1 60 100 100 14.29 7.79 57 100 100 0 100 100 100

Example 3:
----------

  *) max.
submission score is set to 1.0 - this changes 0.6, 0.3, and 0.1 to
     1.0, 0.5, and 0.16666667
  *) normalize structure comparison scores: this changes 0.5 and 1.0 to
     0.33333333 and 0.66666667
  *) matching pairs are
       T0021 0 1ALA _ 1 <---> T0021 0 1ALA _ 1
       T0021 0 8ACN _ 1 <---> T0021 0 8ACN _ 1
       T0021 0 1GKY _ 1 <---> T0021 0 1GKY _ 1
  *) multiply values for all the matching pairs: 0.33333333, 0.33333333, 0.0
  *) sum up to: 0.66666667 (which is 66.67% of what's possible)

The evaluator reports:

Conf TDbs TNt0 TCrct TCmx TSpc TSns TBst TChnc ARms ACrct ASpc ASns Shft Covr ACSpc ACSns Sub SCo
T0021 0 100 7 3 2 2 60 66.67 100 64.29 6.28 32.5 100 100 0 100 100 100

In the examples above, both best case specificities and chance specificities
are given too:


8) Best Case Specificity ("TBst")
---------------------------------

The threading specificity, as described above, does not take into account
the ranking in the prediction's list of matches; merely the amount of the
bet placed on correct hits is used. The best-case specificity, as reported
here, is calculated from the rank (position in a list sorted by score) of
the best scoring (first) correct hit. If the best-scoring correct hit is
listed in first place (has the highest rank), the best-case specificity is
100%. It is calculated as the number of structures listed in the prediction
minus the rank of the first correct hit plus one, divided by the number of
structures (this number is then multiplied by 100).

In a simple example a search database contains 100 structures, and the best
scoring correct hit is found in position 2 after sorting - this results in a
best-case specificity of 99%. If there are, for example, 5 hits with equal
score at the top of the list, and only one of them is correct, the best case
specificity will be 96% only (as if the first correct hit was found in
position 5).
If, for example, 3 out of 5 equally scoring hits at the top of the list are
correct hits, the best case specificity will be 98% (as if the first correct
hit was found at position 3). A threshold is set, with respect to
structure-structure comparison scores, below which a "hit" is not considered
to be correct; the threshold used for the evaluation was 0.5.

                              Size of Search Set - Rank of first correct hit + 1
Best Case Specificity = 100 * --------------------------------------------------
                                             Size of Search Set

In examples 1, 2, and 3 above, the best case specificity is 100%, because
the best scoring match was always a correct match ("1ALA _ 1").

Example 4:
----------

Submission                      Structure comparison
TSCORE T0021 0 0.0 NONE _ 0     TSCORE T0021 0 0.0 NONE _ 0
TSCORE T0021 0 0.3 1ALA _ 1     TSCORE T0021 0 1.0 1ALA _ 1
TSCORE T0021 0 0.6 8ACN _ 1     TSCORE T0021 0 0.0 8ACN _ 1
TSCORE T0021 0 0.1 1GKY _ 1     TSCORE T0021 0 0.0 1GKY _ 1
TSCORE T0021 0 0.0 1THT A 2     TSCORE T0021 0 0.0 1THT A 2
TSCORE T0021 0 0.0 1DSB A 1     TSCORE T0021 0 0.0 1DSB A 1
TSCORE T0021 0 0.0 1PRT F 0     TSCORE T0021 0 0.0 1PRT F 0
TSCORE T0021 0 0.0 1UBI _ 0     TSCORE T0021 0 0.0 1UBI _ 0

In this example the threading specificity is only 30%, since 30% of the
total bet is placed on the correct hit. However, the best case specificity
amounts to 85.71%, since the best scoring correct hit is found at position 2
in a list of 7 structures: (7 - 2 + 1) / 7 = 0.8571 (the dummy structure
"NONE" is not evaluated in this case).

The evaluator reports:

Conf TDbs TNt0 TCrct TCmx TSpc TSns TBst TChnc ARms ACrct ASpc ASns Shft Covr ACSpc ACSns Sub SCo
T0021 0 100 7 3 1 1 30 50 85.71 14.29 7.79 57 100 100 0 100 100 100


9) Chance (Best Case) Specificity ("TChnc")
-------------------------------------------

Chance specificity is the best-case specificity expected by chance. This
number depends on the size of the search database ("TDbs") and on the number
of correct structures found there ("TCmx").
It does not depend on the scores given to structures by the predictors. In a
search database with 100 structures that contains 10 correct hits, it is
expected that one of them would fall into the top 10% purely by chance -
therefore the chance best-case specificity would be
(100 - 10 + 1) / 100 = 91%.

In examples 1, 2, and 4 above, the chance (best-case) specificity is 14.29%:
since there's only one correct structure in the search database of seven
proteins, one would expect to find it, purely by chance, in the top 7 hits.
The chance specificity is then calculated like the best case specificity
above, (7 - 7 + 1) / 7 = 0.1429. In example 3, there are two correct
structures in the search database, so one might expect to find one of them
in the "top half" of seven hits by chance; the chance specificity is
therefore calculated as (7 - 3.5 + 1) / 7 = 0.6429.

                                                 Size of Search Set
                           Size of Search Set - --------------------------------- + 1
                                                 No. of Correct Str. in Search Set
Chance Specificity = 100 * ----------------------------------------------------------
                                             Size of Search Set


10) Alignment RMS-deviation ("ARms")
------------------------------------

A virtual "model" for the target structure was constructed from the C-alpha
coordinates of a hit's database structure and the corresponding alignment,
as found in the prediction. The root mean square deviation for C-alpha atoms
was calculated after superposition of this model onto the actual target
structure (coordinate space superposition based on the C-alpha coordinates).
The RMSD values are given in Angstrom units.


11) Number of correctly aligned residues ("ACrct")
--------------------------------------------------

"ACrct" is obtained by multiplying the length of the prediction's alignment
"ALen" by the alignment specificity "ASpc" - i.e. the fraction of correctly
aligned residues. It simply gives the number of residues which have been
aligned correctly by the prediction.
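The threading-level quantities of sections 6) to 9) can be sketched in a few
lines of Python. This is only an illustration under stated assumptions, not
the evaluation software itself: the function names are hypothetical, scores
are passed as dictionaries keyed by database structure (with the dummy "NONE"
already removed and every database structure listed, including those with
score 0), a hit counts as correct when its structure comparison score is at
least the threshold of 0.5, and ties are resolved as in the text (as if the
correct hit came last among the equally scoring incorrect ones).

```python
def threading_specificity(pred, ref):
    """TSpc: prediction normalized to sum 1, reference scaled to max <= 1."""
    total = sum(pred.values())
    if total == 0:
        return 0.0  # no bet placed on any real structure
    scale = max(1.0, max(ref.values(), default=0.0))
    return 100.0 * sum(p / total * ref.get(h, 0.0) / scale
                       for h, p in pred.items())


def threading_sensitivity(pred, ref):
    """TSns: same product, but with the two normalization schemes swapped."""
    total = sum(ref.values())
    top = max(pred.values(), default=0.0)
    if total == 0 or top == 0:
        return 0.0
    return 100.0 * sum(pred.get(h, 0.0) / top * r / total
                       for h, r in ref.items())


def best_case_specificity(pred, ref, threshold=0.5):
    """TBst from the rank of the best-scoring correct hit (ties count against)."""
    correct = {h for h, r in ref.items() if r >= threshold and h in pred}
    if not correct:
        return 0.0
    best = max(pred[h] for h in correct)
    higher = sum(1 for s in pred.values() if s > best)
    tied_wrong = sum(1 for h, s in pred.items()
                     if s == best and h not in correct)
    rank = higher + tied_wrong + 1
    return 100.0 * (len(pred) - rank + 1) / len(pred)


def chance_specificity(n_structures, n_correct):
    """TChnc: best-case specificity expected by chance."""
    if n_correct == 0:
        return 0.0
    return 100.0 * (n_structures - n_structures / n_correct + 1) / n_structures


# Scores of Example 3 (submission) and its structure comparison result.
names = ["1ALA", "8ACN", "1GKY", "1THT", "1DSB", "1PRT", "1UBI"]
pred3 = dict(zip(names, [0.6, 0.3, 0.1, 0.0, 0.0, 0.0, 0.0]))
ref3 = dict(zip(names, [0.5, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]))
# Submission scores of Example 4; its reference has 1ALA as the only hit.
pred4 = dict(zip(names, [0.3, 0.6, 0.1, 0.0, 0.0, 0.0, 0.0]))
ref4 = dict(zip(names, [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]))

print(round(threading_specificity(pred3, ref3), 2))
print(round(threading_sensitivity(pred3, ref3), 2))
print(round(best_case_specificity(pred4, ref4), 2))
print(round(chance_specificity(7, 2), 2))
```

Keeping the zero-score structures in the dictionaries matters for the
best-case value, because the size of the search set is taken from the number
of entries.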
Averaging --------- The alignment-specific numbers found in the first table are weighted averages for all the alignments to correct hits in the submissions. Weighting is based on i) the bet placed on the respective hit in the prediction (normalized after removing the hit to "NONE"), and ii) the structure similarity score of the respective hit as listed in the structure-structure comparison results. For example, the (weighted) average alignment specificity calculates as: sum( ASpc(i) * AWgt(i) * SCWgt(i) ) (i) ASpc = ------------------------------------- sum( AWgt(i) * SCWgt(i) ) (i) where (i) is the set of "correct" hits in the prediction. AWgt(i) is the fraction of the bet (ignoring overall confidence) that was placed on the specific match/alignment. SCWgt(i) is the probability, as defined by the structure comparison method, that these folds are similar (always 1 or 0 for VAST-results). A Note on "Subsets": -------------------- Predictors were given the chance to use subsets of the target sequence for their predictions, these subsets can be specified in the submission format. The same applies to structural subsets, since one might want to parse three-dimensional protein structures into domains, and use (sub)domains in a search database. If two or more sequence subsets are given in the prediction submission, the evaluation software will consequently give two or more separate lines specifying different confidence, if applicable, and the respective evaluation quantities, calculated separately for each sequence subset - which is therefore treated like a separate prediction! Structure-structure comparison searches might have used target structures parsed into separate domains, giving results for the whole chain and for separate domains too, which can be quite different depending on the metric used and the assessment of significance. 
Furthermore, structure-structure comparison methods might as well use structure-subsets (domains) of the database proteins in their search set. In cases of redundancy in the structure-structure comparison results, it had to be decided which of the respective (sub)results to be used for evaluation purposes. A prediction submission, for example, might have listed a hit of T00xx, sequence subset 2, to a database structure 1ABC X 1. In the structure-structure comparison results one might, for example, find a hit of the whole protein T00xx 0 to 1ABC X 0 - the whole chain of 1ABC X - and a hit of a domain T00xx 1 to a domain 1ABC X 1, as well as a hit of domain T00xx 2 to 1ABC X 2. The parsing of a target sequence/structure into subdomains does not necessarily match between prediction and structure-structure comparison, neither do the domain definitions used for database structures. For the selection of the appropriate structure-structure comparison match, against which a prediction match was evaluated, an algorithm was used which basically tried to maximize alignment coverage; i.e. the fraction of target sequence residues, which have been used for an alignment in both the prediction and the structure-structure comparison. 12) and 13) Alignment Specificity, Alignment Sensitivity ("ASpc", "ASns") ------------------------------------------------------------------------- For each hit to a database structure with a non-zero score, predictors were required to submit a sequence-to-structure alignment, which would be sufficient to construct a three-dimensional model of the backbone atoms for the target range which is covered by the alignment. In general, methods for threading/fold-recognition use such assignments of residues from a target sequence to coordinates (residues) from a database structure, to assess the quality of the respective hit. 
The results from structure-structure comparison searches of the final target
structures against PDB were summarized in the same format. For the
evaluation of alignment-specific quantities, alignments were compared to
alignments - or alignment-derived data to alignment-derived data. A single
prediction might, of course, have hits to several structures in each search
set, that are i) listed with a non-zero score, and ii) found to be similar
to the target by the structure comparison method, expressed by a non-zero
score in the structure comparison result. For all these, the
alignment-related quantities are calculated separately, and they are
summarized, by taking their weighted average, in the first table the
evaluation system reports.

Alignment specificity is the percentage of aligned target sequence residues,
as found in the submission, which are aligned correctly, compared with the
structure-comparison alignment.

                                   Number of correctly aligned residues
Alignment Specificity = 100 * --------------------------------------------
                              Number of residues aligned by the prediction

Alignment sensitivity is the percentage of aligned target sequence residues,
as found in the structure-comparison alignment, that have also been
correctly aligned in the submission.

                                        Number of correctly aligned residues
Alignment Sensitivity = 100 * ------------------------------------------------------
                              Number of residues aligned by the structure comparison

Examples:
For simplicity, example no. 1 is used with respect to the scores, so that
only one alignment (T0021 0 <---> 1ALA _ 1) will be evaluated. Alignments
are reported in blocks. For simplicity it is assumed that the correct
alignment is a single block only. In the first case the target sequence
residues 3 to 59 are aligned to the database structure residues 260 to 316;
the reference alignment found in the structure comparison result is the
same.

Submission: TALIGN T0021 0 3 59 1ALA _ 1 260 316 1.0 1
Str.Comp.
: TALIGN T0021 0 3 59 1ALA _ 1 260 316 1.0 1

The evaluator reports:

$Thread:
Conf TDbs TNt0 TCrct TCmx TSpc TSns TBst TChnc ARms ACrct ASpc ASns Shft Covr ACSpc ACSns Sub SCo
T0021 0 100 7 1 1 1 100 100 100 14.29 7.79 57 100 100 0 100 100 100

$Aligns:
SCWgt SCLen SCRms SC%id AWgt ALen ARms A%id ACrct ASpc ASns Shft Covr ACSpc ACSns Sub SCo
T0021 0 1ALA 1 1 57 7.79 7.02 1 57 7.79 7.02 57 100 100 0 100 100 100

The fields of an "$Aligns" line are, from left to right:

  Names - Target, Subset, Database structure, Chain identifier,
          Domain identifier
  SCWgt - Str.-Str. comparison weight
  SCLen - Str.-Str. comp. alignment length
  SCRms - Str.-Str. comp. RMS-deviation
  SC%id - % identical residues in Str. comp. alignment
  AWgt  - Weight (percentage of bet) given by prediction
  ALen  - Prediction alignment length
  ARms  - Prediction alignment RMS-deviation
  A%id  - % identical residues in prediction alignment
  ACrct - Number of residues aligned correctly in prediction
  ASpc  - Alignment Specificity
  ASns  - Alignment Sensitivity
  Shft  - Alignment Mean Shift Error
  Covr  - Alignment Coverage
  ACSpc - Alignment Contact Specificity
  ACSns - Alignment Contact Sensitivity

That is an alignment specificity and an alignment sensitivity of 100% each.

For the second case the alignment as found in the submission is missing some
part from the correct block:

Submission: TALIGN T0021 0 3 27 1ALA _ 1 260 284 1.0 1
Submission: TALIGN T0021 0 42 59 1ALA _ 1 299 316 1.0 1
Str.Comp. : TALIGN T0021 0 3 59 1ALA _ 1 260 316 1.0 1

The evaluator reports:

$Thread:
Conf TDbs TNt0 TCrct TCmx TSpc TSns TBst TChnc ARms ACrct ASpc ASns Shft Covr ACSpc ACSns Sub SCo
T0021 0 100 7 1 1 1 100 100 100 14.29 7.23 43 100 75.44 0 100 100 60

$Aligns:
SCWgt SCLen SCRms SC%id AWgt ALen ARms A%id ACrct ASpc ASns Shft Covr ACSpc ACSns Sub SCo
T0021 0 1ALA 1 1 57 7.79 7.02 1 43 7.23 6.98 43 100 75.44 0 100 100 60

The alignment sensitivity goes down to 75.44%, since the submission covers
only 43 of the 57 aligned residues (but these 43 are aligned correctly).

The third case is somewhat worse.
The second aligned block of the submission has been "shifted" by one residue
with respect to the "correct" alignment, and the correct alignment found in
the structure comparison result is now somewhat shorter (at the N-terminus):

Submission: TALIGN T0021 0 3 27 1ALA _ 1 260 284 1.0 1
Submission: TALIGN T0021 0 43 60 1ALA _ 1 299 316 1.0 1
Str.Comp. : TALIGN T0021 0 5 59 1ALA _ 1 262 316 1.0 1

The evaluator reports:

$Thread:
Conf TDbs TNt0 TCrct TCmx TSpc TSns TBst TChnc ARms ACrct ASpc ASns Shft Covr ACSpc ACSns Sub SCo
T0021 0 100 7 1 1 1 100 100 100 14.29 7.15 23 53.49 41.82 0.425 93.02 42.86 25.71

$Aligns:
SCWgt SCLen SCRms SC%id AWgt ALen ARms A%id ACrct ASpc ASns Shft Covr ACSpc ACSns Sub SCo
T0021 0 1ALA 1 1 57 7.79 7.02 1 43 7.15 4.65 23 53.49 41.82 0.425 93.02 42.86 25.71

Here only 23 residues are aligned correctly (the block 5 - 27 <---> 262 -
284). With 43 aligned residues given in the submission, the specificity goes
down to 23/43 -> 53.49%; with 55 aligned residues in the structure
comparison, the alignment sensitivity goes down to 23/55 -> 41.82%.


14) The Alignment mean shift error ("Shft")
-------------------------------------------

The mean shift error is the average distance (in 1-residue units) between
residues of the database structure, to which the same target sequence
residue has been aligned in the submission and the structure comparison
result. It is calculated from only a fraction of the residues which are
aligned in the submission; this fraction (the alignment coverage) is
described below.

In the example above, residues 5-27 have been aligned correctly, so their
contribution to the shift error is 0; residues 3-4 are not found aligned in
the respective structure comparison. Residues 43-59 are again found aligned
in both submission and structure comparison, however, with an offset of 1
residue!
The mean shift error per aligned residue is then computed as:
(17 * 1.0 + 23 * 0.0) / (17 + 23) = 17/40 = 0.425

                   sum( abs( Position Residue i is aligned to in prediction -
                   (i)       Position Residue i is aligned to in Structure comp. ) )
Mean Shift Error = -------------------------------------------------------------------
                     No. of residues aligned by prediction AND structure comparison


15) The alignment coverage ("Covr")
-----------------------------------

                           No. of residues aligned by prediction AND structure comparison
Alignment coverage = 100 * --------------------------------------------------------------
                                     No. of residues aligned by the prediction

The alignment as found in the submission of the example above includes 43
residues. However, only 40 of these are found aligned to something in the
structure comparison results too; the alignment coverage is therefore
computed as the ratio 40 / 43 = 93.02%.


16) and 17) Alignment contact specificity and -sensitivity ("ACSpc", "ACSns")
-----------------------------------------------------------------------------

Each correct prediction gives one (or several) alignments between the target
sequence and database structures. In principle, a three-dimensional model
(at least for the backbone atoms) for the target protein chain could be
constructed by simply copying the coordinates of the database structure(s),
according to the alignment. The alignment specificity, as described above,
is very sensitive to even small mean shift errors. An alignment with a low
mean shift error can have an alignment specificity of 0, when none of the
residues is aligned correctly, but still a high fraction of contacts can be
predicted correctly by the model.
Furthermore, some structures have repeated substructures, and an alignment
with a high mean shift error and an alignment specificity of 0 can still
predict a rather large fraction of contacts correctly when, for example, the
alignment is shifted by one or several of the repetitive units in a
structure. Consider the following case:

Submission: TALIGN T0021 0  3 27 1ALA _ 1 260 284 1.0 1
Submission: TALIGN T0021 0 43 60 1ALA _ 1 299 316 1.0 1
Str.Comp. : TALIGN T0021 0  5 59 1ALA _ 1 262 316 1.0 1

The structure comparison result aligns residues 5-59 from the target sequence
to residues 262-316 from the database structure 1ALA. Contacts are defined as
pairs of C-alpha atoms less than 8 Å apart, excluding pairs of residues that
are close in sequence (separated by fewer than 5 residues). In the list
below, the contacts as observed in 1ALA are listed in the first two columns;
the third column indicates the "predicted" contacts for the target, according
to the alignment.

       [,1]  [,2]  SComp
 [1,]   263   303   6-46
 [2,]   264   269   7-12  *
 [3,]   264   303   7-46
 [4,]   264   304   7-47  *
 [5,]   272   277  15-20  *
 [6,]   273   281  16-24  *
 [7,]   273   312  16-55
 [8,]   273   315  16-58
 [9,]   273   316  16-59  *
[10,]   274   315  17-58
[11,]   276   281  19-24  *
[12,]   277   315  20-58
[13,]   277   316  20-59  *
[14,]   281   316  24-59
[15,]   285   295  28-38
[16,]   285   296  28-39
[17,]   287   292  30-35
[18,]   288   293  31-36
[19,]   288   294  31-37
[20,]   288   295  31-38
[21,]   288   296  31-39
[22,]   289   294  32-37
[23,]   289   295  32-38
[24,]   289   296  32-39
[25,]   294   299  37-42
[26,]   296   316  39-59
[27,]   297   313  40-56
[28,]   297   316  40-59
[29,]   300   309  43-52
[30,]   300   312  43-55
[31,]   300   313  43-56
[32,]   301   309  44-52
[33,]   301   313  44-56  *
[34,]   304   309  47-52
[35,]   305   310  48-53  *

Below, the contacts predicted for the target sequence in the threading
submission are listed in the same way:

       [,1]  [,2]  Subm
 [1,]   263   303   6-47
 [2,]   264   269   7-12  *
 [3,]   264   303   7-47  *
 [4,]   264   304   7-48
 [5,]   272   277  15-20  *
 [6,]   273   281  16-24  *
 [7,]   273   312  16-56
 [8,]   273   315  16-59  *
 [9,]   273   316  16-60
[10,]   274   315  17-59
[11,]   276   281  19-24  *
[12,]   277   315  20-59  *
[13,]   277   316  20-60
[14,]   281   316  24-60
[15,]   300   309  44-53
[16,]   300   312  44-56  *
[17,]   300   313  44-57
[18,]   301   309  45-53
[19,]   301   313  45-57
[20,]   304   309  48-53  *
[21,]   305   310  49-54

The asterisks indicate predicted contacts found in both the threading
submission and the structure comparison. For the calculation of contact
specificity, the number of correctly predicted contacts is divided by the
total number of predicted contacts, i.e.

    9 / 21 = 0.42857

The contact sensitivity is then calculated as the number of correctly
predicted contacts divided by the number of actual contacts, as predicted by
the structure comparison results, that is, in this case:

    9 / 35 = 0.25714

The evaluator reports these as percentages:

$Thread:         Conf  TDbs  TNt0 TCrct  TCmx  TSpc  TSns  TBst TChnc  ARms ACrct  ASpc  ASns  Shft  Covr ACSpc ACSns
Sub SCo T0021 0   100     7     1     1     1   100   100   100 14.29  7.15    23 53.49 41.82 0.425 93.02 42.86 25.71

$Aligns:               SCWgt SCLen SCRms SC%id  AWgt  ALen  ARms  A%id ACrct  ASpc  ASns  Shft  Covr ACSpc ACSns
Sub SCo T0021 0 1ALA 1     1    57  7.79  7.02     1    43  7.15  4.65    23 53.49 41.82 0.425 93.02 42.86 25.71

18) Structure comparison alignment weight ("SCWgt")
---------------------------------------------------

The score assigned to each hit in structure comparison results is treated as
the probability, determined by the respective method, that the target
structure and the respective database structure are similar. This quantity,
found in structure-structure comparison results for the respective hit, is
listed in the alignment-tables.

19) Structure comparison length of alignment ("SCLen")
------------------------------------------------------

gives the number of target sequence residues aligned to database structure
residues in the structure-structure comparison results.
20) Structure comparison RMSD ("SCRms")
---------------------------------------

Root mean square deviations from structure-structure comparisons were listed
in the respective results; these are copied to the alignment evaluation
tables. If missing, RMSD-values were calculated from the C-alpha coordinates
of a virtual "model", constructed from the database structure of a hit and
the corresponding alignment, by superposition of this model on the actual
target structure in coordinate space. These RMSD-values are given in
Angstroem units.

21) Percentage of identical residues in structure comparison alignment ("SC%id")
--------------------------------------------------------------------------------

This number gives the fraction of residues from the target sequence which
have been aligned to identical residue types by the structure comparison
method. The fraction is expressed as a percentage.

22) Prediction alignment weight ("AWgt")
----------------------------------------

A threading/fold-recognition prediction assigns fractions of the total bet
to distinct hits from its list of database structures. The number reported as
"AWgt" is the fraction of this bet placed on the hit corresponding to the
alignment evaluated. This fraction is calculated without taking into account
the bet placed on the dummy structure "NONE".

23) Prediction alignment length ("ALen")
----------------------------------------

gives the number of target sequence residues aligned to database structure
residues in the threading/fold-recognition prediction.

24) Percentage of identical residues in prediction alignment ("A%id")
---------------------------------------------------------------------

As for the structure-structure comparison alignment, this number gives the
fraction of residues from the target sequence which have been aligned to
identical residue types by the prediction; the fraction is again expressed as
a percentage.
Probabilistic Alignments
------------------------

The submission format permitted the use of probabilistic alignments: instead
of specifying just one way for the target sequence to be aligned to a
database structure, several different alignments could be specified, along
with different probabilities that were required to sum up to 1. In the
evaluation, probabilistic alignments are converted into matrices
(target-sequence residues x database-structure residues) that hold the
alignment information as "weights" associated with the assignment of
particular target-sequence residues to particular database-structure
residues. If a target sequence residue is aligned to some position in all the
probabilistic alignments given, the respective row in the alignment-matrix
will sum up to 1.0; however, different probabilistic alignments can vary in
length and in the regions they cover, so they do not in general provide a
sum-weight of 1.0 for each of the aligned residues!

For the calculation of alignment specificity and sensitivity, the elements of
the alignment-matrix are treated as individual weights assigned to the
respective sequence-residue - structure-position matches. Alignment
specificity is then calculated as the sum over all the weights assigned to
correctly aligned positions (as compared with the structure comparison),
divided by the sum over all the weights assigned. Similarly, the alignment
sensitivity is calculated as the weight assigned to correctly aligned
positions divided by the sum over all the weights in the
structure-comparison alignment matrix.
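The conversion into an alignment matrix can be sketched as follows. This is
a minimal Python illustration, not the evaluator's code: the sparse
dictionary representation and the names `alignment_matrix` and `row_weight`
are assumptions made for this sketch. The segments and probabilities are the
ones used in the example that follows.

```python
from collections import defaultdict


def alignment_matrix(alignments):
    """Build a sparse matrix {(target_residue, structure_residue): weight}.

    `alignments` is a list of (probability, segments); each segment is
    (target start, target end, structure start, structure end).
    """
    matrix = defaultdict(float)
    for probability, segments in alignments:
        for t_start, t_end, s_start, _s_end in segments:
            for offset in range(t_end - t_start + 1):
                matrix[(t_start + offset, s_start + offset)] += probability
    return dict(matrix)


def row_weight(matrix, target_residue):
    """Sum of the weights in the matrix row of one target residue."""
    return sum(w for (t, _), w in matrix.items() if t == target_residue)


submission = alignment_matrix([
    (0.55, [(3, 27, 260, 284), (43, 60, 299, 316)]),
    (0.45, [(2, 26, 260, 284), (40, 59, 297, 316)]),
])
reference = alignment_matrix([(1.0, [(5, 59, 262, 316)])])

total_weight = sum(submission.values())
# A matrix element counts as correct if the reference aligns the same pair.
correct_weight = sum(w for cell, w in submission.items() if cell in reference)
specificity = correct_weight / total_weight
sensitivity = correct_weight / sum(reference.values())

print(f"total weight   {total_weight:.2f}")
print(f"correct weight {correct_weight:.2f}")
print(f"specificity    {specificity:.4f}")
print(f"sensitivity    {sensitivity:.4f}")
```

In this sketch target residue 5 is aligned in both submission alignments, so
its row sums to 1.0, whereas residue 2 appears only in the second alignment
and its row sums to 0.45.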
Example:
--------

The following example specifies two different alignments:

REMARK       --------------------------------------------------------
Submission:  TALIGN T0021 0  3 27 1ALA _ 1 260 284 0.55 1
Submission:  TALIGN T0021 0 43 60 1ALA _ 1 299 316 0.55 1
Submission:  TALIGN T0021 0  2 26 1ALA _ 1 260 284 0.45 2
Submission:  TALIGN T0021 0 40 59 1ALA _ 1 297 316 0.45 2
REMARK       --------------------------------------------------------
SComparison: TALIGN T0021 0  5 59 1ALA _ 1 262 316 1.0 1
REMARK       --------------------------------------------------------

The evaluator reports:

$Thread:         Conf  TDbs  TNt0 TCrct  TCmx  TSpc  TSns  TBst TChnc  ARms ACrct  ASpc  ASns  Shft  Covr ACSpc ACSns
Sub SCo T0021 0   100     7     1     1     1   100   100   100 14.29  7.15  21.7 49.32 39.36 0.448 93.17 40.73 24.04

$Aligns:               SCWgt SCLen SCRms SC%id  AWgt  ALen  ARms  A%id ACrct  ASpc  ASns  Shft  Covr ACSpc ACSns
Sub SCo T0021 0 1ALA 1     1    57  7.79  7.02     1    44  7.15  6.82  21.7 49.32 39.36 0.448 93.17 40.73 24.04

The alignment's RMS-deviation ("ARms") is calculated for only one of the
several probabilistic alignments: the one with the highest probability (0.55,
and alignment id of 1 in this case) is selected. "ALen" and "A%id" are
(unweighted) averages over all the alignments in this case.

In the first alignment, 43 residues are aligned (3-27 and 43-60) with a
"weight" of 0.55; in the second alignment, 45 residues are aligned (2-26 and
40-59) with a weight of 0.45. This sums up to a total weight of

    43*0.55 + 45*0.45 = 43.9

placed on the alignment matrix.
Of the first alignment, 23 residues are aligned correctly (5-27), compared
with the structure comparison alignment; of the second alignment, 20 residues
are correctly aligned (40-59). The weight assigned to correct matrix elements
is therefore

    23*0.55 + 20*0.45 = 21.65     ("ACrct")

The alignment specificity is calculated as

    21.65 / 43.9 = 0.4932

The total weight placed on the alignment matrix of the structure comparison
alignment is 55 (residues 5-59, probability 1.0). The alignment sensitivity
is calculated as

    21.65 / 55 = 0.3936

For the calculation of the mean shift error, the weight given to individual
components of the alignment matrix must be taken into account too. The mean
shift error for probabilistic alignments is given as the mean displacement
from the ideal, for all aligned residues that are also aligned in the
structure comparison results, scaled by the weights given to them. In the
example, the correctly aligned portions 5-27 <--> 262-284 (Alignment 1) and
40-59 <--> 297-316 (Alignment 2) do not contribute to the mean shift error.
However, segments 43-59 <--> 299-315 (Alignment 1) and 5-26 <--> 263-284
(Alignment 2) each have an offset-error of 1 residue; therefore they
contribute (residues * weight * offset-error):

    17 * 0.55 * 1 + 22 * 0.45 * 1 = 19.25

In total, aligned residues were taken into consideration between residue
numbers 5 to 27 and 40 to 59, that is 43 residues in total. Everything
outside this range is either not aligned in the structure-comparison
alignment or nowhere aligned in the threading submission alignments. The mean
shift error is therefore calculated as

    19.25 / 43 = 0.4477 residues

The alignment coverage is calculated as the fraction of the total weight
from the alignment-matrix given to target-sequence positions that were also
aligned in the structure-comparison alignment. In the example here, the total
weight placed on the alignment matrix was 43.9, as calculated above.
The weight placed on residues aligned in the structure comparison results is,
for segments 5-27 and 43-59 (Alignment 1), (23+17) * 0.55, and for segments
5-26 and 40-59 (Alignment 2), (22+20) * 0.45, that is

    40 * 0.55 + 42 * 0.45 = 40.9

The alignment coverage is calculated as

    40.9 / 43.9 = 0.9317

For the calculation of contact-specific parameters, a matrix of predicted
contacts is constructed for the target sequence (and a matrix of observed
contacts, as derived from the structure comparison results). The elements of
the contact-matrix are the sums over the products between elements from the
respective rows of the alignment matrices, whenever they are found to be
aligned to positions making contacts in the database structure.

In the example below, the first column gives the observed contacts in 1ALA
which lie in the range used in the example threading submission. The second
column lists the contacts the structure comparison method has predicted for
the target sequence. The 3rd-6th columns list the contacts as predicted from
the alignment matrix, where A1 and A2 are the indices of the two different
alignments the sequence-structure matching refers to (the associated
probabilities were 0.55 for Alignment 1 and 0.45 for Alignment 2). Asterisks
indicate that the predicted contact is correct. Dashes mark contact partners
to which no target residue is aligned. The weights used below are derived
from the products of the individual alignment probabilities. Contacts with
less than 5 residues sequential separation are ignored (they are shown in
parentheses).
       1ALA     SComp    A1-A1    A2-A2    A1-A2    A2-A1
==========================================================
 [1,]  263-303  6-46     6-47     5-46     6-46*    5-47
 [2,]  264-269  7-12     7-12*    6-11    (7-11)    6-12
 [3,]  264-303  7-46     7-47*    6-46*    7-46*    6-47
 [4,]  264-304  7-47     7-48     6-47     7-47*    6-48
 [5,]  272-277  15-20    15-20*   14-19   (15-19)   14-20
 [6,]  273-281  16-24    16-24*   15-23    16-23    15-24
 [7,]  273-312  16-55    16-56    15-55    16-55*   15-56
 [8,]  273-315  16-58    16-59*   15-58    16-58*   15-59
 [9,]  273-316  16-59    16-60    15-59    16-59*   15-60
[10,]  274-315  17-58    17-59    16-58*   17-58*   16-59*
[11,]  276-281  19-24    19-24*   18-23   (19-23)   18-24
[12,]  277-315  20-58    20-59*   19-58    20-58*   19-59
[13,]  277-316  20-59    20-60    19-59    20-59*   19-60
[14,]  281-316  24-59    24-60    23-59    24-59*   23-60
[15,]  285-295  28-38    -----    -----    -----    -----
[16,]  285-296  28-39    -----    -----    -----    -----
[17,]  287-292  30-35    -----    -----    -----    -----
[18,]  288-293  31-36    -----    -----    -----    -----
[19,]  288-294  31-37    -----    -----    -----    -----
[20,]  288-295  31-38    -----    -----    -----    -----
[21,]  288-296  31-39    -----    -----    -----    -----
[22,]  289-294  32-37    -----    -----    -----    -----
[23,]  289-295  32-38    -----    -----    -----    -----
[24,]  289-296  32-39    -----    -----    -----    -----
[25,]  294-299  37-42    ---43    ---42    ---42    ---43
[26,]  296-316  39-59    ---60    ---59    ---59    ---60
[27,]  297-313  40-56    ---57    40-56*   ---56    40-57
[28,]  297-316  40-59    ---60    40-59*   ---59    40-60
[29,]  300-309  43-52    44-53    43-52*   44-52*   43-53
[30,]  300-312  43-55    44-56*   43-55*   44-55    43-56*
[31,]  300-313  43-56    44-57    43-56*   44-56*   43-57
[32,]  301-309  44-52    45-53    44-52*   45-52    44-53
[33,]  301-313  44-56    45-57    44-56*   45-56    44-57
[34,]  304-309  47-52    48-53*   47-52*  (48-52)   47-53
[35,]  305-310  48-53    49-54    48-53*  (49-53)   48-54
==========================================================
no.contacts     35       21       23       16       23
correct         35       9        11       12       2
weight          1        0.3025   0.2025   0.2475   0.2475

Predicted Contacts : 21 * 0.3025 + 23 * 0.2025 + (16+23) * 0.2475 = 20.663
Correct predictions:  9 * 0.3025 + 11 * 0.2025 + (12+ 2) * 0.2475 =  8.415

The total weight put on correctly predicted contacts divided by the total
weight put on any valid predicted contacts gives the contact specificity;
the weight put on correctly predicted contacts divided by the weight put on
all contacts in the structure comparison results gives the contact
sensitivity.

Contact Specificity: 8.415 / 20.663 = 0.4073
Contact Sensitivity: 8.415 / 35     = 0.2404

(i.e. 40.73% of the contacts predicted by the probabilistic alignments were
correct, but only 24.04% of the actual contacts were predicted correctly).
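
The weighted contact counts of this example can be retraced with a short
Python sketch. It is an illustration only, not the evaluator's code: the
names `structure_to_target` and `predicted_contacts` are made up here, the
contact list is copied from the first column of the table, and each
combination of two alignments is weighted by the product of their
probabilities, as in the table.

```python
from itertools import product

# Contacts observed in 1ALA (structure residue numbers), copied from the table.
CONTACTS = [
    (263, 303), (264, 269), (264, 303), (264, 304), (272, 277), (273, 281),
    (273, 312), (273, 315), (273, 316), (274, 315), (276, 281), (277, 315),
    (277, 316), (281, 316), (285, 295), (285, 296), (287, 292), (288, 293),
    (288, 294), (288, 295), (288, 296), (289, 294), (289, 295), (289, 296),
    (294, 299), (296, 316), (297, 313), (297, 316), (300, 309), (300, 312),
    (300, 313), (301, 309), (301, 313), (304, 309), (305, 310),
]
MIN_SEPARATION = 5  # contacts closer than this in the target are ignored


def structure_to_target(segments):
    """Map structure residue -> target residue for TALIGN-like segments."""
    mapping = {}
    for t_start, t_end, s_start, _s_end in segments:
        for offset in range(t_end - t_start + 1):
            mapping[s_start + offset] = t_start + offset
    return mapping


def predicted_contacts(map_a, map_b):
    """Target pairs implied when the first contact partner is aligned
    via map_a and the second via map_b."""
    pairs = []
    for first, second in CONTACTS:
        if first in map_a and second in map_b:
            i, j = map_a[first], map_b[second]
            if abs(j - i) >= MIN_SEPARATION:
                pairs.append((i, j))
    return pairs


alignments = [
    (0.55, structure_to_target([(3, 27, 260, 284), (43, 60, 299, 316)])),
    (0.45, structure_to_target([(2, 26, 260, 284), (40, 59, 297, 316)])),
]
reference = structure_to_target([(5, 59, 262, 316)])
observed = set(predicted_contacts(reference, reference))

predicted_weight = 0.0
correct_weight = 0.0
# Combinations A1-A1, A1-A2, A2-A1 and A2-A2, weighted by p_a * p_b.
for (p_a, map_a), (p_b, map_b) in product(alignments, repeat=2):
    pairs = predicted_contacts(map_a, map_b)
    predicted_weight += p_a * p_b * len(pairs)
    correct_weight += p_a * p_b * sum(1 for pair in pairs if pair in observed)

specificity = correct_weight / predicted_weight
sensitivity = correct_weight / len(observed)
print(f"predicted {predicted_weight:.4f}  correct {correct_weight:.4f}")
print(f"ACSpc {100 * specificity:.2f}  ACSns {100 * sensitivity:.2f}")
```

The unrounded predicted weight in this sketch is 20.6625, which the table
reports as 20.663.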