An alternate method was needed to find sno-like RNAs in the other sequenced archaeal genomes. Because I had previous success with a specialized snoRNA gene-finding program [Lowe & Eddy, 1999], I decided to tailor a search for archaeal snoRNAs. The original program used a probabilistic model trained on known human and yeast snoRNAs. With the new set of verified S. acidocaldarius sRNA genes cloned by the Dennis lab, I retrained the program for archaeal sRNAs. The search algorithm and general model remained as originally described [Lowe & Eddy, 1999]. Alignments of the box features (C, D, C', D') of the S. acidocaldarius sRNAs were used to create log-odds weight matrices reflecting the frequency of each nucleotide at each position in each box feature. The lengths of the rRNA complementary region and of the gaps between box features were scored with binned length distributions. Overall, the training data for the nucleotide content of the box features did not change significantly, but the distribution of lengths between features did; archaeal sRNAs appear to be much more compact than those in eukaryotes, and their rRNA complementary regions are shorter (commonly 8-11 nt long, compared to 12-14 nt in S. cerevisiae).
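The two scoring components described above can be illustrated with a short Python sketch. It is not the original program [Lowe & Eddy, 1999]; it only shows one plausible way to build a per-position log-odds weight matrix from aligned box sequences and to score a length with a binned distribution. The function names, the uniform background, the pseudocount, the bin boundaries, and the toy sequences and lengths are all assumptions made for illustration.

```python
import math
from collections import Counter

ALPHABET = "ACGU"


def log_odds_matrix(aligned_boxes, background=None, pseudocount=1.0):
    """Per-position log2-odds matrix from equal-length aligned box sequences.

    Assumptions (not from the source): uniform background frequencies and an
    additive pseudocount so that unseen nucleotides get a finite score.
    """
    if background is None:
        background = {base: 0.25 for base in ALPHABET}
    length = len(aligned_boxes[0])
    if any(len(seq) != length for seq in aligned_boxes):
        raise ValueError("all aligned box sequences must have the same length")
    total = len(aligned_boxes) + pseudocount * len(ALPHABET)
    matrix = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in aligned_boxes)
        column = {}
        for base in ALPHABET:
            freq = (counts[base] + pseudocount) / total
            column[base] = math.log2(freq / background[base])
        matrix.append(column)
    return matrix


def score_box(matrix, candidate):
    """Sum the log-odds of each nucleotide of a candidate box."""
    if len(candidate) != len(matrix):
        raise ValueError("candidate length does not match the matrix")
    return sum(column[base] for column, base in zip(matrix, candidate))


def binned_length_scorer(training_lengths, bins, pseudocount=1.0):
    """Return a function mapping a length to the log2 probability of its bin.

    `bins` is a list of inclusive (low, high) ranges; the boundaries used
    below are hypothetical.
    """
    counts = [sum(low <= n <= high for n in training_lengths) for low, high in bins]
    total = sum(counts) + pseudocount * len(bins)
    log_probs = [math.log2((c + pseudocount) / total) for c in counts]

    def score(length):
        for (low, high), log_prob in zip(bins, log_probs):
            if low <= length <= high:
                return log_prob
        return float("-inf")  # length falls outside every bin

    return score


# Toy data only: invented 4-nt box alignment and complementarity lengths.
toy_boxes = ["CUGA", "CUGA", "CUGA", "CAGA"]
box_matrix = log_odds_matrix(toy_boxes)
length_score = binned_length_scorer(
    [8, 9, 10, 10, 11, 12], bins=[(1, 7), (8, 11), (12, 14), (15, 30)]
)
```

In this sketch a candidate's total score would be the sum of its box scores and its length scores, so a candidate with consensus-like boxes and a complementary region in the most populated length bin ranks highest. How the original program combines these terms is described in [Lowe & Eddy, 1999] and is not reproduced here.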
I started the sRNA genome searches in Sulfolobus solfataricus, for which approximately half the genome sequence was available. The program identified dozens of sRNA candidates, each of which had the potential to target a modification to a particular position in the ribosomal RNA of S. solfataricus. Because I had very little a priori knowledge of verified ribose methylation sites in S. solfataricus rRNA, I sorted all candidates by overall score, regardless of the target rRNA methylation site. I designed primers against the top twenty sRNA candidates and performed primer extensions on total S. solfataricus RNA to identify sRNA transcripts of the correct length. Based on the cloned sRNAs, I assumed that new sRNAs should have a 5' end 2-6 nt upstream from the predicted C box. Ten of the top 13 scoring candidates produced primer extension products of approximately the expected size (data not shown). An alignment of the 10 verified S. solfataricus sRNAs and 3 other predictions is shown in Figure 5.4 (below the S. acidocaldarius clones for comparison). For seven sRNA candidates, we also attempted to verify a predicted target ribose methylation site, again using the dNTP concentration-dependent primer extension assay. Sites for four sRNAs were verified (see Table 5.2). Because we have evidence for rRNA methylation sites corresponding to a number of verified or predicted sRNAs, we believe that, as in eukaryotes, C/D box sRNAs function as guides for methylation.
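The size criterion used for the primer extensions can be made concrete with a small Python sketch. It assumes that coordinates are numbered along the transcribed strand, that "2-6 nt upstream" is counted from the first nucleotide of the predicted C box, and that a product counts as "approximately" the right size if it falls within a tolerance of the expected range. The function names, the coordinate convention, and the tolerance value are hypothetical and are not taken from the experiments described above.

```python
def expected_product_lengths(c_box_start, primer_5prime_pos,
                             min_offset=2, max_offset=6):
    """Range of expected primer-extension product lengths in nt.

    c_box_start: coordinate of the first nucleotide of the predicted C box.
    primer_5prime_pos: coordinate of the transcript nucleotide paired with
        the 5' end of the primer (the most downstream position it covers).
    The transcript 5' end is assumed to lie min_offset..max_offset nt
    upstream of the C box.
    """
    if primer_5prime_pos < c_box_start:
        raise ValueError("primer must anneal downstream of the C box start")
    shortest = primer_5prime_pos - (c_box_start - min_offset) + 1
    longest = primer_5prime_pos - (c_box_start - max_offset) + 1
    return shortest, longest


def is_expected_size(observed_length, c_box_start, primer_5prime_pos,
                     tolerance=1):
    """True if an observed product length fits the expected range.

    `tolerance` (in nt) is a hypothetical allowance for gel sizing error.
    """
    shortest, longest = expected_product_lengths(c_box_start, primer_5prime_pos)
    return shortest - tolerance <= observed_length <= longest + tolerance


# Toy coordinates only: C box starting at 100, primer 5' end pairing at 150.
product_range = expected_product_lengths(100, 150)
```

With these toy coordinates the expected products span 53 to 57 nt, so a band at 55 nt would be consistent with the prediction and a band at 70 nt would not.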