# Sat Sep 1 14:15:28 PDT 2007 Kevin Karplus
# WORK IN PROGRESS: NOT READY FOR USE
# The str4 alphabet is a new secondary structure alphabet that can be
# computed by undertaker.  It is based in part on the successful str2
# alphabet and in part on the n_notor2 alphabet.
#
# There are 3 sections to the alphabet:
#    helix letters, based primarily on hydrogen-bonding patterns
#    strand letters, based on hydrogen bonding and NOtor torsion angles
#    other letters, based mainly on phi-psi angles.

# Helix letters are assigned first.  Any residue that gets a helix
# letter cannot be subsequently relabeled.
#    If there is an Hbond Oi-2 to Ni+3, with NOtor angle <-133 or >76,
#    then assign letter I (pi helix).
#    If there is also a normal helical Oi-2,Ni+2, then assign letter J.
#
#    If there is an Hbond Oi-2 to Ni+1, then assign letter G (3-10 helix or turn).
#    If there is ALSO Oi-2 to Ni+2, then assign letter F instead.
#    The combined 3-10 and alpha hbonds are most commonly found at
#    the C-terminal end of a helix.
#
#    If there is an Hbond Oi-2 to Ni+2, then assign letter H (helix).
#
# Note: F,H,J all mean that there is an Oi-2,Ni+2 Hbond.
#    Every separation=4 Hbond results in exactly one F,H, or J being labeled.
#
# There is some non-reversibility in this alphabet, as HJ is
# common, but JH is never seen.
#
# Note that all the helix letter definitions are based on what
# happens at Oi-2.  Other definitions were considered, but this
# seemed to provide the cleanest assignments with common
# multiple H-bond possibilities.
#
# Strand letters are assigned next, based solely on Hbond patterns.
#    Strands are separated into parallel, antiparallel, and mixed;
#    edge strands are further separated by bonded/unbonded;
#    antiparallel bonded residues are further separated by
#        NOtor angle.
#
# [Sun Oct 7 2007.
#    Removed the torsion angle constraints from the parallel
#    strands.  Also added in the multiple hydrogen bonds for
#    strands.
# ]
#
# For an H-bond to be considered part of a parallel strand,
#    the sheet-partner of i is j-1, and the
#    sheet-partner of j is i+1.
# For an Hbond to be considered part of an antiparallel strand,
#    the sheet-partner of i is j and of j is i.
#
# If there are multiple Hbonds on the acceptor atom, sidechain
# Hbonds are ignored, and the ratio of the probabilities
# of the hydrogen bonds is checked.  The log of the ratio
# is taken, and if |logratio| <= 5 and the hydrogen bonds
# are from HO_i to HN_j and HN_j+1, they are considered
# a multiple strand; otherwise the strongest NO Hbond is used
# for classification.
#
# Let's refer to the Hbonds of residue i as HN_i and HO_i,
# if they exist.
#
# When assigning different letters based on NOtor angle, the
# threshold is set at <=-17 degrees.  If both HN_i and
# HO_i exist, the one with the larger separation is used.
# Since the bonding partners are usually the same
# residue for both in an antiparallel strand, we use
# HN_i if the bonding partner j>i and HO_i if the bonding
# partner j<i.
# [...] (>=2.955)
# and short hydrogen bonds (<2.955).

########################################
# We end up with 5 helix labels, 8 strand labels, 5 loop labels,
# and the bifurcated-Hbond label W, for a total of 19 letters in the
# alphabet (20 counting the AllMatch wildcard X).
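# The multiple-Hbond test and the HN_i/HO_i choice described above can
# be sketched in Python as follows.  This is only an illustration, not
# undertaker code: the function names and arguments are hypothetical,
# the logarithm is assumed to be natural (the text does not give a
# base), and the tie-break for equal separations is assumed to follow
# the j>i / j<i rule using the HN_i partner.

```python
import math


def is_multiple_strand(p_first, p_second, donor_first, donor_second,
                       max_abs_log_ratio=5.0):
    """Decide whether two backbone Hbonds onto one acceptor O count as a
    multiple strand.

    p_first, p_second: Hbond probabilities (sidechain Hbonds are assumed
    to have been filtered out already).
    donor_first, donor_second: residue indices of the two donor N atoms.
    """
    if p_first <= 0 or p_second <= 0:
        return False
    # The donors must be HN_j and HN_j+1, i.e. adjacent residues.
    adjacent = abs(donor_first - donor_second) == 1
    # Natural log is an assumption; the source only says "log".
    log_ratio = math.log(p_first / p_second)
    return adjacent and abs(log_ratio) <= max_abs_log_ratio


def hbond_for_notor(i, hn_partner=None, ho_partner=None):
    """Return "HN", "HO", or None: which Hbond of residue i supplies the
    NOtor angle.  Partners are residue indices, or None if no Hbond."""
    if hn_partner is None and ho_partner is None:
        return None
    if ho_partner is None:
        return "HN"
    if hn_partner is None:
        return "HO"
    sep_hn = abs(hn_partner - i)
    sep_ho = abs(ho_partner - i)
    if sep_hn != sep_ho:
        # Use the Hbond with the larger sequence separation.
        return "HN" if sep_hn > sep_ho else "HO"
    # Equal separation (usually the same partner j): HN_i if j>i, else HO_i.
    return "HN" if hn_partner > i else "HO"
```

# For example, hbond_for_notor(10, hn_partner=20, ho_partner=20) picks
# HN_i because the shared partner j=20 is greater than i=10.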
########################################

ClassName = Alphabet
Name = str5
IsNucleic = 0
NormalChars = ABCDEFGHIJKLMPQRWYZ
AllMatch = X

# Helix types
CharName = G 3_10 3_10_helix_or_turn
CharName = H helix_long alpha_helix_long_hydrogen_bonds
CharName = I helix_short alpha_helix_short_hydrogen_bonds
CharName = J pi pi_helix
CharName = K other_helical other_helical

# strand types
CharName = A anti_middle antiparallel_middle_strand
CharName = Y anti_edge_bonded antiparallel_edge_bonded
CharName = Z anti_edge_un antiparallel_edge_strand_unbonded
CharName = L mixed_anti mixed_middle_strand_anti
CharName = M mixed_parallel mixed_middle_strand_parallel
CharName = P parallel parallel_middle_strand
CharName = Q para_edge_bonded parallel_edge_strand_bonded
CharName = R para_edge_un parallel_edge_strand_unbonded

# other types, based on NO_dist2 and r_rot
CharName = B NOdist_low_rot_r_low NOdist_low_rot_r_low
CharName = C rot_r_mid rot_r_mid
CharName = D NOdist_low_rot_r_high NOdist_low_rot_r_high
CharName = E NOdist_high_rot_r_low NOdist_high_rot_r_low
CharName = F NOdist_high_rot_r_high NOdist_high_rot_r_high

# other types, bifurcated Hbond (|log(ratio)| of hbond probabilities
# < 5 and separation between the two residues is one.)
CharName = W bifurcated bifurcated_multiple_hbonds

# If none of the Hbond definitions assign a code, and the residue does
# not have an alpha torsion angle, assign X.

EndClassName = Alphabet
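
# As a consistency check on the definition above, the following Python
# sketch parses the block and confirms that every letter in NormalChars
# has a CharName entry.  It is not the parser used by the real tools:
# parse_alphabet and STR5_DEFINITION are hypothetical names, and the
# sketch assumes one "Key = value" pair per line with "#" starting a
# comment.

```python
STR5_DEFINITION = """\
ClassName = Alphabet
Name = str5
IsNucleic = 0
NormalChars = ABCDEFGHIJKLMPQRWYZ
AllMatch = X
CharName = G 3_10 3_10_helix_or_turn
CharName = H helix_long alpha_helix_long_hydrogen_bonds
CharName = I helix_short alpha_helix_short_hydrogen_bonds
CharName = J pi pi_helix
CharName = K other_helical other_helical
CharName = A anti_middle antiparallel_middle_strand
CharName = Y anti_edge_bonded antiparallel_edge_bonded
CharName = Z anti_edge_un antiparallel_edge_strand_unbonded
CharName = L mixed_anti mixed_middle_strand_anti
CharName = M mixed_parallel mixed_middle_strand_parallel
CharName = P parallel parallel_middle_strand
CharName = Q para_edge_bonded parallel_edge_strand_bonded
CharName = R para_edge_un parallel_edge_strand_unbonded
CharName = B NOdist_low_rot_r_low NOdist_low_rot_r_low
CharName = C rot_r_mid rot_r_mid
CharName = D NOdist_low_rot_r_high NOdist_low_rot_r_high
CharName = E NOdist_high_rot_r_low NOdist_high_rot_r_low
CharName = F NOdist_high_rot_r_high NOdist_high_rot_r_high
CharName = W bifurcated bifurcated_multiple_hbonds
EndClassName = Alphabet
"""


def parse_alphabet(text):
    """Parse one 'ClassName = Alphabet' block into a dict.

    CharName lines are collected under the "CharName" key as
    {letter: (short_name, long_name)}; other keys map to their value.
    """
    fields = {}
    char_names = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key == "EndClassName":
            break
        if key == "CharName":
            letter, short_name, long_name = value.split()
            char_names[letter] = (short_name, long_name)
        else:
            fields[key] = value
    fields["CharName"] = char_names
    return fields


alphabet = parse_alphabet(STR5_DEFINITION)
# Letters listed in NormalChars that lack a CharName line.
missing = sorted(set(alphabet["NormalChars"]) - set(alphabet["CharName"]))
print(len(alphabet["NormalChars"]), "letters; missing CharName entries:", missing)
```

# The same check can be run on an edited copy of the definition before
# the alphabet is used, to catch a letter added to NormalChars without
# a matching CharName line.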