Description
This track shows the locations of non-protein-coding RNA genes and
pseudogenes. The data were kindly provided by
Sean Eddy at Washington University.
Feature types:
- tRNA: Transfer RNA (or pseudogene)
- rRNA: Ribosomal RNA (or pseudogene)
- scRNA: Small cytoplasmic RNA (or pseudogene)
- snRNA: Small nuclear RNA (or pseudogene)
- snoRNA: Small nucleolar RNA (or pseudogene)
- miRNA: MicroRNA (or pseudogene)
- misc_RNA: Miscellaneous other RNA, such as Xist (or pseudogene)
Methods
Eddy-tRNAscanSE (tRNA genes, Sean Eddy)
tRNAscan-SE 1.23 w/ default parameters.
Score field contains tRNAscan-SE bit score; >20 is good, >50 is great.
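The score thresholds above can be sketched as a small helper. This is only an illustration: the function name is hypothetical, and the label for scores at or below 20 is an assumption, since the description names only the "good" and "great" tiers.

```python
def trnascan_score_tier(bit_score: float) -> str:
    """Map a tRNAscan-SE bit score to the qualitative tiers given above."""
    if bit_score > 50:
        return "great"
    if bit_score > 20:
        return "good"
    # Assumption: the track description does not name this tier.
    return "low"
```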
Eddy-BLAST-tRNAlib (tRNA pseudogenes, Sean Eddy)
WUBLAST 2.0, w/ options "-kap wordmask=seg B=50000 W=8 cpus=1".
Score field contains % identity in BLAST-aligned region.
Used each of 602 tRNAs and pseudogenes predicted by tRNAscan-SE
in the human oo27 assembly as queries; kept all nonoverlapping
regions that hit one or more of these w/ P <= 0.001.
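One way to reduce many query hits to nonoverlapping regions is sketched below. The description does not say how overlapping hits were resolved, so merging them into a single interval is an assumption, as are the `Hit` record, the half-open coordinates, and the decision to ignore strand.

```python
from typing import List, NamedTuple, Tuple


class Hit(NamedTuple):
    """Hypothetical record for one BLAST hit on the genome."""
    chrom: str
    start: int      # 0-based, inclusive (assumed convention)
    end: int        # exclusive (assumed convention)
    p_value: float


def nonoverlapping_regions(hits: List[Hit],
                           max_p: float = 0.001) -> List[Tuple[str, int, int]]:
    """Keep hits with P <= max_p and merge overlapping ones per chromosome."""
    kept = sorted((h for h in hits if h.p_value <= max_p),
                  key=lambda h: (h.chrom, h.start, h.end))
    regions: List[Tuple[str, int, int]] = []
    for h in kept:
        if regions and regions[-1][0] == h.chrom and h.start < regions[-1][2]:
            # Overlaps the previous region: extend it instead of adding a new one.
            chrom, start, end = regions[-1]
            regions[-1] = (chrom, start, max(end, h.end))
        else:
            regions.append((h.chrom, h.start, h.end))
    return regions
```

With this sketch, two overlapping hits from different tRNA queries collapse into one region, which matches the wording "regions that hit one or more of these".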
Eddy-BLAST-snornalib (known snoRNAs and snoRNA pseudogenes, Steve Johnson)
WUBLASTN 2.0, w/ options "-V=25 -hspmax=5000 -kap wordmask=seg
B=5000 W=8 cpus=1".
Score field contains BLAST score.
Used each of 104 unique snoRNAs in snorna.lib as a query.
Any hit >=95% full length and >=90% identity is annotated as a
"true gene".
Any other hit with P <= 0.001 is annotated as a "related sequence",
and interpreted as a putative pseudogene.
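The snoRNA annotation rule above can be written as a short decision function. This is a sketch under assumptions: the function and parameter names are hypothetical, and "full length" is interpreted as the aligned length expressed as a percentage of the query length.

```python
from typing import Optional


def classify_snorna_hit(aligned_len: int, query_len: int,
                        pct_identity: float, p_value: float) -> Optional[str]:
    """Classify one snoRNA BLAST hit using the thresholds stated above."""
    # Assumption: "full length" means aligned length relative to the query.
    coverage = 100.0 * aligned_len / query_len
    if coverage >= 95.0 and pct_identity >= 90.0:
        return "true gene"
    if p_value <= 0.001:
        return "related sequence"  # interpreted as a putative pseudogene
    return None  # hit is not annotated
```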
Eddy-BLAST-otherrnalib
(non-tRNA, non-snoRNA noncoding RNAs with GenBank entries
for the human gene)
WUBLASTN 2.0 [15 Apr 2002]
w/ options: "-kap -cpus=1 -wordmask=seg -W=8 -E=0.01 -hspmax=0
-B=50000 -Z=3000000000"
Exceptions:
- Large ncRNAs (LSU & SSU rRNA, H19, Xist):
changed "-W=8" to "-W=11"; added "-maskextra=50".
Xist contains repetitive elements and was masked with
RepeatMasker, Library version 6.8.
- microRNAs:
"-kap -cpus=1 -S=70 -hspmax=0 -B=100" replaces all
above parameters.
Score field contains BLASTN score.
Used 41 unique miRNAs and 29 other ncRNAs as queries.
Any hit >=95% full length and >=95% identity is annotated as a
"true gene".
Any other hit with P <= 0.001 and >= 65% identity is annotated
as a "related sequence".
Exception: all miRNAs are 16-26 bp sequences in GenBank
and are annotated only if the hit is 100% full length and 100% identity.
The miRNAs consist of Let-7 from Pasquinelli et al.,
Nature (2000) 408:86, and 40 from Mourelatos et al., Genes & Dev (2002)
16:720.
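The annotation rules for this library, including the miRNA exception, can be sketched as follows. The function and parameter names are hypothetical, "full length" is again interpreted as aligned length relative to the query, and labelling an exact miRNA match as a "true gene" is an assumption: the description says only that such hits are annotated.

```python
from typing import Optional


def classify_other_rna_hit(aligned_len: int, query_len: int,
                           pct_identity: float, p_value: float,
                           is_mirna: bool = False) -> Optional[str]:
    """Classify one BLAST hit for the other-ncRNA library."""
    if is_mirna:
        # miRNA queries are very short, so only exact full-length hits count.
        if aligned_len >= query_len and pct_identity >= 100.0:
            return "true gene"  # assumed label for an annotated miRNA hit
        return None
    # Assumption: "full length" means aligned length relative to the query.
    coverage = 100.0 * aligned_len / query_len
    if coverage >= 95.0 and pct_identity >= 95.0:
        return "true gene"
    if p_value <= 0.001 and pct_identity >= 65.0:
        return "related sequence"
    return None  # hit is not annotated
```

Keeping the miRNA branch separate reflects the stricter rule: a 16-26 bp query would otherwise pass percentage thresholds with only a base or two of mismatch.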