Description

This track displays transcriptome data from tiling GeneChips produced by Affymetrix. On ten chromosomes (6, 7, 13, 14, 19, 20, 21, 22, X, and Y), more than 74 million probes were tiled every 5 bp in non-repeat-masked areas and hybridized to mRNA from the SK-N-AS cell line. These data are a preview of Phase Two of the transcriptome project, which will include data from 7 additional cell lines when completed. Although the genome coverage is much larger and the probe density greater, the general method is similar to that of the Phase One project carried out on chromosomes 21 and 22 (Kapranov, P., Cawley, S.E., Drenkow, J., Bekiranov, S., Strausberg, R.L., Fodor, S.P.A. and Gingeras, T.R. (2002) "Large-Scale Transcriptional Activity in Chromosomes 21 and 22." Science, 296(5569):916-9).

The track is colored blue in areas that are thought to be transcribed at a statistically significant level, as described in the accompanying Transfrags (transcribed fragments) track, and brown everywhere else. Transfrags that have a significant BLAT hit elsewhere in the genome are colored a lighter shade of blue, and transfrags that overlap putative pseudogenes are colored an even lighter shade of blue. Although the raw data are based on perfect match minus mismatch (PM - MM) probe values and may therefore be negative, the track has a minimum value of 0 for visualization purposes.

Methods

For each data point, probes within 30 bp on either side were used to improve the estimate of the expression level at that probe. This smooths the data and produces a more robust estimate of the transcription level at a particular genomic location. Specifically, the analysis method was as follows:

  • Replicate arrays were quantile-normalized and the median intensity (using both PM and MM intensities) of each array was scaled to a target value of 44.
  • The expression level was estimated for each mapped probe position by
    • collecting all the probe pairs that fell within a window of ± 30 bp.
    • calculating all non-redundant pairwise averages of PM - MM values of all probe pairs in the window.
    • taking the median of all resulting pairwise averages.
  • The resulting signal value is the Hodges-Lehmann estimator associated with the Wilcoxon signed-rank statistic, computed over the PM - MM values that lie within a sliding window extending ± 30 bp around each genomic coordinate.
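
The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used to generate the track: all function names are hypothetical, probe positions are assumed to be sorted, ties in quantile normalization are broken by input order, and "non-redundant pairwise averages" is read as the Walsh averages (each pair counted once, including each value paired with itself), which is the usual definition for the Hodges-Lehmann estimator.

```python
from bisect import bisect_left, bisect_right
from itertools import combinations_with_replacement
from statistics import mean, median


def quantile_normalize(arrays):
    """Give every replicate array the same intensity distribution.

    `arrays` is a list of equal-length intensity lists (hypothetical layout).
    """
    n = len(arrays[0])
    # Mean of the k-th smallest intensity across arrays, for each rank k.
    rank_means = [mean(col) for col in zip(*(sorted(a) for a in arrays))]
    normalized = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = rank_means[rank]
        normalized.append(out)
    return normalized


def scale_to_median(intensities, target=44.0):
    """Rescale one array so that its median intensity equals `target`."""
    m = median(intensities)
    return [v * target / m for v in intensities]


def hodges_lehmann(values):
    """Median of all Walsh averages (x_i + x_j) / 2 with i <= j (assumed)."""
    walsh = [(a + b) / 2 for a, b in combinations_with_replacement(values, 2)]
    return median(walsh)


def windowed_signal(positions, pm_minus_mm, half_window=30):
    """Estimate the signal at each probe position from probes within
    +/- half_window bp. `positions` must be sorted in ascending order."""
    signal = []
    for center in positions:
        lo = bisect_left(positions, center - half_window)
        hi = bisect_right(positions, center + half_window)
        signal.append(hodges_lehmann(pm_minus_mm[lo:hi]))
    return signal
```

For example, `windowed_signal([0, 5, 10, 100], [1, 2, 6, 10])` pools the first three probes into one window and leaves the isolated probe at position 100 on its own. Because the estimate is a median of pairwise averages, a single outlying probe pair in a window shifts it less than it would shift a plain mean.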

Credits

  • Data Generation and Analysis: Transcriptome group at Affymetrix - Bekiranov S, Brubaker S, Cheng J, Dike S, Drenkow J, Ghosh S, Gingeras T, Helt G, Kampa D, Kapranov P, Long J, Madhavan G, Manak J, Patel S, Piccolboni A, Sementchenko V, Tammana H.
  • Data Presentation at UC Santa Cruz - Chuck Sugnet.