The April 2003 human reference sequence (UCSC version hg15) is based on NCBI Build 33. It was released in conjunction with the International Human Genome Sequencing Consortium's announcement of the successful completion of the Human Genome Project. This reference sequence covers about 99 percent of the human genome's gene-containing regions and has been sequenced to an accuracy of 99.99 percent. The missing portions are essentially contained in fewer than 400 defined gaps that represent DNA regions with unusual structures that can't be reliably sequenced using current technology. The average DNA letter now lies within a stretch of approximately 27,332,000 base pairs of uninterrupted sequence!
Chromosomal sequences for this release were assembled by the International Human Genome Sequencing Consortium sequencing centers and verified by NCBI and UCSC. In some cases, sequence joins between adjacent clones could not be computationally validated because the clones originated from different haplotypes and contained polymorphisms in the overlapping sequence, or because the overlap was too small to be reliable. In these instances, the sequencing center responsible for the particular chromosome has provided data to support the join in the form of an electronic certificate. These certificates may be reviewed through the link below.
Bulk downloads of the sequence and annotation data are available via the Genome Browser FTP server or the Downloads page. The hg15 annotation tracks were generated by UCSC and collaborators worldwide. See the Credits page for a detailed list of the organizations and individuals who contributed to the success of this release.
A genome position can be specified by the accession number of a sequenced genomic clone, an mRNA, an EST, or an STS marker; by a cytological band; by a chromosomal coordinate range; or by keywords from the GenBank description of an mRNA. The following list shows examples of valid position queries for the human genome. See the User's Guide for more information.
Request: | Genome Browser Response: |
chr7 | Displays all of chromosome 7 | |
20p13 | Displays region for band p13 on chr 20 | |
chr3:1-1000000 | Displays first million bases of chr 3, counting from p arm telomere | |
scf1:1-1000000 | Displays first million bases of scaffold 1 of an unmapped genome assembly | |
D16S3046 | Displays region around STS marker D16S3046 from the Genethon/Marshfield maps. Includes 100,000 bases on each side as well. | |
RH18061;RH80175 | Displays region between STS markers RH18061 and RH80175. Includes 100,000 bases on each side as well. | |
AA205474 | Displays region of EST with GenBank accession AA205474 in BRCA1 cancer gene on chr 17 | |
AC008101 | Displays region of clone with GenBank accession AC008101 | |
AF083811 | Displays region of mRNA with GenBank accession number AF083811 | |
PRNP | Displays region of genome with HUGO identifier PRNP | |
NM_017414 | Displays the region of genome with RefSeq identifier NM_017414 | |
NP_059110 | Displays the region of genome with protein accession number NP_059110 | |
11274 | Displays the region of genome with LocusLink identifier 11274 | |
pseudogene mRNA | Lists transcribed pseudogenes, but not cDNAs | |
homeobox caudal | Lists mRNAs for caudal homeobox genes | |
zinc finger | Lists many zinc finger mRNAs | |
kruppel zinc finger | Lists only kruppel-like zinc fingers | |
huntington | Lists candidate genes associated with Huntington's disease | |
zahler | Lists mRNAs deposited by scientist named Zahler | |
Evans,J.E. | Lists mRNAs deposited by co-author J.E. Evans | |
Use this last format for author queries. Although GenBank requires the search format Evans JE, internally it uses the format Evans,J.E.
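Two of the query formats in the table above are regular enough to handle in a script before submitting them. The following Python sketch is illustrative only and is not the Genome Browser's own parsing code. The function names `parse_range` and `genbank_author_to_internal` are hypothetical, and the regular expression is an assumption based solely on the examples shown (`chr3:1-1000000`, `scf1:1-1000000`, and `Evans JE` versus `Evans,J.E.`).

```python
import re

# Assumed shape of a coordinate range query: sequence name, colon, start-end.
# Derived from the examples in the table only; the browser may accept more forms.
_RANGE_RE = re.compile(r"^(?P<seq>[A-Za-z0-9_]+):(?P<start>\d+)-(?P<end>\d+)$")


def parse_range(query):
    """Return (seq, start, end) for a query such as 'chr3:1-1000000', else None."""
    match = _RANGE_RE.match(query.strip())
    if match is None:
        return None
    start = int(match.group("start"))
    end = int(match.group("end"))
    # Reject ranges that cannot describe a non-empty 1-based interval.
    if start < 1 or end < start:
        return None
    return match.group("seq"), start, end


def genbank_author_to_internal(name):
    """Convert a GenBank-style author such as 'Evans JE' to 'Evans,J.E.'."""
    surname, _, initials = name.strip().rpartition(" ")
    if not surname or not initials.isalpha():
        raise ValueError("expected a surname followed by initials, e.g. 'Evans JE'")
    # Put a period after each initial and join to the surname with a comma.
    return surname + "," + "".join(letter + "." for letter in initials.upper())


if __name__ == "__main__":
    print(parse_range("chr3:1-1000000"))
    print(genbank_author_to_internal("Evans JE"))
```

Queries that do not match the assumed range pattern, such as `chr7` or `20p13`, make `parse_range` return `None`, so a caller can pass them through unchanged as whole-chromosome, band, accession, or keyword queries.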