The Known Genes track shows known protein-coding genes based on proteins from SWISS-PROT, TrEMBL, and TrEMBL-NEW and their corresponding mRNAs from GenBank. Coding exons are displayed taller than 5' and 3' untranslated regions (UTRs). Connecting introns are drawn as one-pixel lines with hatch marks indicating the direction of transcription. Entries that have corresponding entries in PDB are colored black. Entries that either have corresponding proteins in SWISS-PROT or have mRNAs that are NCBI Reference Sequences with a "Reviewed" status are colored dark blue. Entries that have mRNAs that are NCBI Reference Sequences with a "Provisional" status are colored lighter blue. All other entries are colored the lightest blue.
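The coloring rules above can be read as a simple decision cascade. The following is a minimal sketch only: the `GeneEntry` class, its field names, and the assumption that the rules are checked in the order they are listed (PDB first) are illustrative and are not taken from the track's actual implementation. Colors are returned as descriptive names because no exact color values are given here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GeneEntry:
    """Hypothetical record; the field names are illustrative only."""
    has_pdb_entry: bool = False
    has_swissprot_protein: bool = False
    refseq_status: Optional[str] = None  # "Reviewed", "Provisional", or None


def track_color(entry: GeneEntry) -> str:
    """Return the display color name for an entry.

    Assumption: rules are applied in the order they are described,
    so a PDB match takes precedence over the blue shades.
    """
    if entry.has_pdb_entry:
        return "black"
    if entry.has_swissprot_protein or entry.refseq_status == "Reviewed":
        return "dark blue"
    if entry.refseq_status == "Provisional":
        return "lighter blue"
    return "lightest blue"
```

For example, under these assumptions an entry with only a "Provisional" RefSeq mRNA is drawn in lighter blue, while an entry with neither a PDB match, a SWISS-PROT protein, nor a RefSeq status falls through to the lightest blue.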
All mRNAs of a species are aligned against the genome using the blat program. When a single mRNA aligns in multiple places, only the best alignments are kept. The alignments must also have at least 98% sequence identity to be kept. This set of mRNA alignments is further reduced by keeping only those mRNAs that are referenced by a protein in SWISS-PROT, TrEMBL, or TrEMBL-NEW.
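The filtering steps in this paragraph can be sketched as follows. This is an illustration under stated assumptions, not the pipeline's actual code: the `Alignment` record and its `score` field are hypothetical, and "best alignments" is interpreted here as the alignments sharing the top score for a given mRNA. Only the 98% identity threshold comes from the description above.

```python
from collections import defaultdict
from dataclasses import dataclass

MIN_IDENTITY = 0.98  # threshold stated in the track description


@dataclass(frozen=True)
class Alignment:
    """Hypothetical alignment record; field names are illustrative."""
    mrna_id: str
    chrom: str
    start: int
    score: int       # assumed alignment score; higher is better
    identity: float  # fraction of identical bases, 0.0 to 1.0


def filter_alignments(alignments, protein_referenced_mrnas):
    """Keep top-scoring, high-identity alignments of protein-referenced mRNAs."""
    by_mrna = defaultdict(list)
    for aln in alignments:
        by_mrna[aln.mrna_id].append(aln)

    kept = []
    for mrna_id, alns in by_mrna.items():
        # Keep only mRNAs referenced by a SWISS-PROT/TrEMBL/TrEMBL-NEW protein.
        if mrna_id not in protein_referenced_mrnas:
            continue
        # Assumption: "best" means tied for the highest score for this mRNA.
        top = max(a.score for a in alns)
        kept.extend(
            a for a in alns
            if a.score == top and a.identity >= MIN_IDENTITY
        )
    return kept
```

Here `protein_referenced_mrnas` is assumed to be a set of mRNA accessions gathered from the protein cross-references. Because the protein check is independent of the alignment checks, applying it first or last gives the same result.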
Among multiple mRNAs referenced by a single protein, the best mRNA is chosen based on a quality score, which depends on its length, how well its translation matches the protein sequence, and its release date. The list of mRNA and protein pairs is further cleaned up by removing short invalid entries and consolidating entries with identical CDS regions.
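The selection and consolidation steps might be sketched as below. The actual quality score formula is not given here, so the ranking in `quality_key` (translation match first, then length, then a preference for the more recent release date) is purely an assumption made for illustration, as are the `MrnaCandidate` record and its fields.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class MrnaCandidate:
    """Hypothetical record; field names are illustrative."""
    mrna_id: str
    length: int
    translation_match: float  # fraction of the protein matched, 0.0 to 1.0
    release_date: date
    cds: tuple                # hashable description of the CDS region


def quality_key(candidate):
    # Assumed ranking only: the real score combines these three inputs
    # in a way that is not specified in the track description.
    return (candidate.translation_match, candidate.length, candidate.release_date)


def pick_best_mrna(candidates):
    """Choose one mRNA among those referenced by a single protein."""
    return max(candidates, key=quality_key)


def consolidate_by_cds(pairs):
    """Collapse (protein_id, mrna) pairs that share an identical CDS region.

    Assumption: the first pair seen for each CDS is the one retained.
    """
    first_by_cds = {}
    for protein_id, mrna in pairs:
        first_by_cds.setdefault(mrna.cds, (protein_id, mrna))
    return list(first_by_cds.values())
```

A tuple key keeps the sketch free of invented weights; a weighted numeric score would work equally well if the real weighting were known.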
Finally, RefSeq entries which are derived from DNA sequences instead of mRNA sequences are added. Disease annotations are from SWISS-PROT.
The Known Genes track is produced at UCSC based primarily on cross-references between proteins from SWISS-PROT (also including TrEMBL and TrEMBL-NEW) and mRNAs from GenBank generated by scientists worldwide. Some NCBI RefSeq data are also included in this track.
The SWISS-PROT entries in this annotation track are copyrighted. They are produced through a collaboration between the Swiss Institute of Bioinformatics and the EMBL Outstation - the European Bioinformatics Institute. There are no restrictions on their use by non-profit institutions as long as their content is in no way modified and this statement is not removed. Usage by and for commercial entities requires a license agreement (see http://www.isb-sib.ch/announce/ or send an email to license@isb-sib.ch).