With the completion of the genome sequence of the nematode Caenorhabditis elegans [C. elegans Sequencing Consortium, 1998], we have the opportunity to study the first complete transfer RNA (tRNA) collection from a complex, multicellular eukaryote. Two studies have been published describing the complete collection of tRNAs within the single-celled eukaryote Saccharomyces cerevisiae [Percudani et al., 1997; Hani & Feldmann, 1998]. Those analyses confirmed early theoretical predictions on the minimal complement of tRNA genes needed by a eukaryote [Guthrie & Abelson, 1982], and provided evidence for the relationship between tRNA species copy number, intracellular tRNA concentration, and protein codon usage.
tRNAs are a critical link in the fidelity of information transfer from messenger RNA (mRNA) to protein sequence. Accurate incorporation of amino acids during translation depends on correct "reading" of the genetic code specified by three-base codons [Crick, 1966]. Because there are only 20 amino acids, there is clearly excess coding capacity among the 64 possible triplets of the code. Naively, one might expect a particular organism to use only a subset of the possible codons, simplifying the cellular machinery needed to translate all of its proteins. In fact, organisms contain all possible codons within (or terminating) their mRNA sequences, so most amino acids are coded for by more than one synonymous codon. If one tRNA were required for each codon, the cell would have to maintain over 60 different tRNA species to translate all possible codons. Instead, many tRNAs specifically recognize more than one codon through non-Watson-Crick base pairings, as first proposed in the "wobble hypothesis" [Crick, 1966].
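The codon arithmetic above can be checked with a short sketch. It assumes the standard genetic code (the 64-character amino acid string below follows the conventional U, C, A, G base ordering, with `*` marking stop codons); the variable names are illustrative and not part of the original analysis.

```python
from collections import Counter
from itertools import product

# Standard genetic code, listed in UCAG order for each codon position.
# '*' marks the three stop codons. (Assumption: standard code only.)
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map every triplet to its one-letter amino acid (or '*').
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Sense codons are those that specify an amino acid rather than a stop.
sense_codons = [codon for codon, aa in CODON_TABLE.items() if aa != "*"]

# Number of synonymous codons per amino acid.
degeneracy = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

print(len(CODON_TABLE), "triplets,", len(sense_codons), "sense codons")
print(len(degeneracy), "amino acids; codons per amino acid:", dict(degeneracy))
```

With one tRNA per sense codon, a cell would need 61 tRNA species, which is the "over 60" figure referred to above.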
Crick initially proposed the wobble rules based on the observation that codons specifying the same amino acid commonly share the same first two nucleotides. He guessed that the first position of the tRNA anticodon could base pair with more than one possible nucleotide in the third position of mRNA codons. Specifically, a G in the first anticodon position could pair with U or C in the third codon position, a U with A or G, and an I (inosine) with U, C, or A (it had been shown that genomically encoded adenosines in the first position of tRNA anticodons are almost universally deaminated to inosine). These simple rules, summarized in Table 3.1, can account for the reduced complement of tRNAs needed for normal translation. In 1982, Guthrie and Abelson [Guthrie & Abelson, 1982] updated and revised the wobble rules, based on observations of characterized yeast tRNAs and their anticodon modifications. They predicted that 46 different tRNA species would be found in yeast (28 were known at the time), and perhaps in all eukaryotes.
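As a minimal sketch, the original wobble rules can be written as a lookup that lists the codons a given anticodon could read. The pairings for G, U, and I are the ones stated above; the entries for C and A are ordinary Watson-Crick pairs added here as an assumption so that every anticodon can be handled. The revised Guthrie and Abelson rules and the effects of other base modifications are not modeled, and the function name `codons_read` is hypothetical.

```python
# Watson-Crick complements for RNA bases (anticodon positions 35 and 36).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Third-codon-position bases readable by the first anticodon base (position 34).
WOBBLE = {
    "G": "UC",   # G pairs with U or C
    "U": "AG",   # U pairs with A or G
    "I": "UCA",  # inosine pairs with U, C, or A
    "C": "G",    # assumption: standard Watson-Crick pairing only
    "A": "U",    # assumption: unmodified A; in vivo it is usually deaminated to I
}

def codons_read(anticodon):
    """Return the codons (5'->3') that an anticodon (5'->3') could read."""
    if len(anticodon) != 3:
        raise ValueError("anticodon must be exactly three bases")
    first, second, third = anticodon.upper()
    # Codon and anticodon pair antiparallel: anticodon base 3 pairs with
    # codon base 1, anticodon base 2 with codon base 2.
    codon_start = COMPLEMENT[third] + COMPLEMENT[second]
    return [codon_start + base for base in WOBBLE[first]]

# An AAG anticodon with A34 deaminated to inosine reads three leucine codons.
print(codons_read("IAG"))
# A GAA anticodon reads both phenylalanine codons.
print(codons_read("GAA"))
```

Under these rules a single inosine-containing tRNA covers three of the four codons in a family, which is how the required tRNA complement falls well below 61.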
Transfer RNAs are the most extensively modified RNAs studied to date. Some modifications are involved in allowing accurate aminoacylation and/or assumption of the native conformation, but those at position 34 (the first anticodon position) either expand base pairing ability (for example, A to I modification) or restrict it (various modifications of U). Only one C. elegans tRNA has been sequenced directly at the RNA level, Leu-AAG, which was found to contain an I at position 34 [Tranquilla et al., 1982]. Numerous C. elegans tRNAs appear in the Sprinzl tRNA database [Steinberg et al., 1993], but all are derived from DNA sequences and therefore carry no modification information.
Codon selection has been observed to be non-uniform, depending on organism, genomic location, and transcription level of the coded gene. This non-uniformity is seen as the result of a balance between mutational bias and selection for translationally optimal codons [Sharp et al., 1993]. Highly transcribed genes, such as those for ribosomal and structural proteins, tend to have the most non-random, biased codon selection, whereas weakly expressed genes, such as those for regulatory factors, tend to have fairly unbiased codon selection. In Escherichia coli and S. cerevisiae, highly expressed genes are biased towards codons decoded by the most abundant tRNA species [Bennetzen & Hall, 1982]. Furthermore, a positive correlation has been observed between the cellular abundance of yeast tRNAs and overall codon frequency [Ikemura, 1982]. This is likely an instance of co-adaptation in which both codon selection and intracellular tRNA concentration change to reach an optimal balance.
Intracellular tRNA levels are controlled by several possible factors: gene copy number, individual transcription rates, and post-transcriptional regulatory mechanisms. In prokaryotes, pressure for genome compactness appears to severely limit the number of redundant tRNA gene copies (e.g., Haemophilus influenzae has 58 tRNAs [Lowe & Eddy, 1997], E. coli has 86 [Blattner et al., 1997]). Thus, the latter two factors are the most likely determinants of tRNA concentrations [Dong et al., 1996]. In yeast, tRNA copy number ranges from 1 to 16 depending on the tRNA species, yielding 274 total genes assorted among 42 unique tRNA classes [Percudani et al., 1997; Hani & Feldmann, 1998]. A strong correlation between gene copy number, intracellular tRNA level, and overall codon preference has been observed [Percudani et al., 1997; Hani & Feldmann, 1998]. These studies confirmed that tRNA levels are primarily influenced by gene copy number in yeast, and that relative tRNA levels may be predicted based on codon preference within highly expressed genes.
The genome of C. elegans is approximately eight times larger than that of S. cerevisiae, with exons making up 27% of the worm genome versus 72% of the yeast genome. Thus, C. elegans is under less evolutionary pressure to maintain a compact genome and may also use tRNA gene copy number as a strategy for regulating intracellular tRNA concentration. In contrast to yeast, however, C. elegans is a complex, multicellular eukaryote with many tissue types, no doubt requiring some degree of tissue-specific regulation of tRNA levels. The internal RNA polymerase III promoter sequences for tRNAs, the A and B boxes, do not change between redundant tRNA copies, whereas upstream and downstream sequences are not conserved and have been shown to modulate eukaryotic tRNA gene expression [Wilson et al., 1985; Young et al., 1986; Reynolds, 1995]. It is unclear to what extent these external enhancer elements are responsible for controlling tRNA concentration. Thus, tRNA copy number may not be predictive of tRNA levels in multicellular eukaryotes like C. elegans.
In this study, we analyze the complete complement of tRNAs within C. elegans to address three main questions. First, do the predicted tRNAs fulfill the minimal 46 classes believed to be necessary for translation of all possible codons? Second, does tRNA copy number correspond to biased codons within the most highly expressed genes, thus implying that gene copy number is a major determinant of intracellular tRNA concentration? Third, what is the nature of the many apparent tRNA pseudogenes? We examine them and find several possible examples of novel SINE elements [Daniels & Deininger, 1985; Deininger, 1989], the first repetitive elements of this class described in C. elegans.