tRNAscan-SE continues to be widely used for genome analysis and has become a standard tool for analyzing newly completed genomes at the major genome sequencing centers. In Chapter 3, I use the program to search the C. elegans genome to identify and analyze the first complete tRNA family from a multicellular eukaryote. This analysis confirmed a classic prediction [Guthrie & Abelson, 1982] regarding tRNA species representation and the application of the ``wobble rule'' to eukaryotes. It also supported a correlation between tRNA gene copy number in the genome and intracellular tRNA levels. Finally, over 200 tRNA-like pseudogenes were identified and classified, including the first example of a high-copy-number SINE-like repetitive element in C. elegans.
Creating and applying tools like tRNAscan-SE is of value to the scientific community, although tRNA research has reached a mature, linear growth phase. Because tRNAs have been intensively studied for at least three decades, few ``unexpected'' biological findings resulted from this project. Upon completion of the tRNA work, I became interested in an active, recently rejuvenated area of RNA research, the small nucleolar RNAs (snoRNAs). A landmark study had recently been published [Kiss-Laszlo et al., 1996] showing the link between one type of snoRNA gene and the placement of ribose methylations within rRNA. The study also implied that dozens of snoRNAs were yet to be discovered in both yeast and mammals, and it gave a detailed profile of snoRNA sequence characteristics that could be used to train a probabilistic search program.
For reasons already discussed, covariance models are not able to model snoRNAs adequately. Instead, I created a new, specialized program employing probabilistic scoring methods, tailored specifically to snoRNA gene features (Chapter 4). My goal was to identify all snoRNAs of this type in the recently completed yeast genome. The project had a well-defined target: associating at least one snoRNA with each of the 55 ribose methylation sites in yeast rRNA. Once the program was implemented, I carried out multiple rounds of snoRNA gene prediction, experimental gene disruption, and assays for loss of the linked ribose methylation. As newly identified snoRNAs were confirmed experimentally, I incorporated them into my training data, thus improving search sensitivity and selectivity for subsequent rounds of prediction. In the end, I was able to identify and verify 22 new snoRNA genes, and assign snoRNAs to 51 of the 55 methylation sites [Lowe & Eddy, 1999]. Combining a new theoretical method with unambiguous experimental verification was key to the success of the project.
The snoRNA search program was then modified and applied to seven archaeal genomes, resulting in the identification of over 200 new snoRNA genes in the first report of snoRNAs in the domain Archaea (Chapter 5). The work was made possible by a collaboration with an experimental lab that provided a ``seed'' alignment of 18 experimentally verified archaeal snoRNAs. Again, the combination of theoretical and experimental methods produced results surpassing what either method could have achieved independently.
In the following sections, I review the two research areas that formed the foundation of the methylation-guide snoRNA work described in Chapters 4 and 5.