With the large collection of sRNA genes identified within the Pyrococcal genomes, we were able to make a reasonable assessment of their locations in the three species relative to protein-coding genes and to each other. The sRNA genes are in general dispersed throughout the genome, and in only four cases are two genes located near each other. Three of the pairs are found near each other in all three genomes: sR14-sR22, sR2-sR9, and sR12-sR34. In each of these pairs the two genes are on opposite strands and oriented away from each other. The distances between the paired genes range from a single nucleotide (Pho sR12-sR34) to 130 nt (Pfu sR2-sR9). The fourth pair, sR50-sR54, found only in P. furiosus, has both genes on the same strand, separated by 34 nucleotides. This is the only possible candidate for a polycistronic sRNA transcript.
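The pairwise description above (strand, relative orientation, and separation) can be expressed as a small coordinate check. The following is a minimal sketch, not the procedure used in this study; the `Gene` record, the function names, and the coordinates in the usage note are hypothetical, and 1-based inclusive coordinates with start <= end are assumed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record; coordinates are 1-based and inclusive."""
    name: str
    start: int
    end: int
    strand: str  # "+" or "-"


def gap_nt(a: Gene, b: Gene) -> int:
    """Number of nucleotides between two genes (0 if they abut or overlap)."""
    first, second = sorted((a, b), key=lambda g: g.start)
    return max(0, second.start - first.end - 1)


def pair_orientation(a: Gene, b: Gene) -> str:
    """Classify a neighbouring pair as tandem, divergent, or convergent."""
    first, second = sorted((a, b), key=lambda g: g.start)
    if first.strand == second.strand:
        # Same strand: the only arrangement compatible with one transcript.
        return "tandem"
    # Opposite strands: the genes point away from each other when the
    # left-hand gene is on the minus strand, and toward each other otherwise.
    return "divergent" if first.strand == "-" else "convergent"
```

With illustrative coordinates, `pair_orientation(Gene("x", 100, 160, "-"), Gene("y", 162, 220, "+"))` classifies the pair as divergent and `gap_nt` reports a separation of one nucleotide, the arrangement described for the three conserved pairs. Only a tandem pair, such as sR50-sR54, would be a candidate for a shared transcript under this scheme.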
Examination of the positions of sRNA loci relative to protein-coding regions yielded a notable finding: 20-35% of sRNA genes appear to overlap partially with either the 5' or the 3' end of an open reading frame (ORF) on the coding strand. Of the 17 overlaps in the P. horikoshii genome, eight occur at the 5' ends of protein ORFs. Based on BLAST results, all eight are likely to be artifacts resulting from incorrect assignment of translation initiation codons. In the nine cases where the overlap occurs at the 3' end of the protein ORF, the overlaps appear to be valid. In most of these cases, the translation stop codon is provided by either the C or the D' box of the overlapping sRNA. A few sRNAs appear to partially overlap coding regions on the opposite strand, but we found no cases of sRNAs located completely within predicted protein ORFs. Almost all sRNAs that do not overlap protein-coding regions are located very near ORF boundaries (5-20 nt) and are probably too close to them to have their own promoters. Thus, they may be co-transcribed with upstream protein-encoding genes and processed out of polycistronic transcripts.
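The distinction between 5' and 3' overlaps depends on the strand of the ORF, since the 5' end of a minus-strand ORF is its higher coordinate. The following is a minimal sketch of that classification, not the analysis pipeline used here; the `Feature` record, the `classify_overlap` function, and its category labels are hypothetical, coordinates are assumed to be 1-based and inclusive, and the sketch does not compare the strand of the sRNA with that of the ORF.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """Hypothetical genomic interval; 1-based, inclusive, start <= end."""
    start: int
    end: int
    strand: str  # "+" or "-"


def classify_overlap(srna: Feature, orf: Feature) -> str:
    """Describe how an sRNA interval overlaps an ORF interval.

    Returns one of: "none", "contained", "spans_orf", "5prime", "3prime".
    The 5'/3' labels refer to the end of the ORF that is overlapped.
    """
    if srna.end < orf.start or srna.start > orf.end:
        return "none"
    if orf.start <= srna.start and srna.end <= orf.end:
        return "contained"

    extends_left = srna.start < orf.start
    extends_right = srna.end > orf.end
    if extends_left and extends_right:
        return "spans_orf"

    # On the plus strand the lower coordinate is the 5' end of the ORF;
    # on the minus strand it is the 3' end.
    left_is_5prime = orf.strand == "+"
    if extends_left:
        return "5prime" if left_is_5prime else "3prime"
    return "3prime" if left_is_5prime else "5prime"
```

For example, with illustrative coordinates, an sRNA at 1880-1940 overlapping a plus-strand ORF at 1000-1900 is reported as a 3' overlap, whereas the same sRNA against a minus-strand ORF at those coordinates is reported as a 5' overlap. Under this scheme, the absence of sRNAs entirely within protein ORFs corresponds to no interval being classified as "contained".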
We observed only one case where an sRNA was encoded completely within another gene: the Pyrococcal sR40 family resides as an intron in the anticodon loop of the gene encoding tRNA-Trp. This intron, which exhibits all of the hallmark features of an Archaeal sRNA, has been independently identified by Daniels and coworkers (personal communication). They present evidence that the D' and D box guides target methylation to positions C42 and C37 within the intron-containing precursor tRNA. We recovered this sRNA in our search because the respective guides appear to be capable of targeting methylation to C1252 in 16S rRNA and C1171 in 23S rRNA. Neither the tRNA nor the rRNA target predictions have been experimentally verified.