This track shows alignments of the $organism genome with itself, using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. The system can also tolerate gaps in both sets of sequence simultaneously. After filtering out the "trivial" alignments produced when identical locations of the genome map to one another (e.g. chrI maps to chrI), the remaining alignments point out areas of duplication within the C. elegans genome. A second filter was applied to remove chains scoring less than 20,000.
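The two filters described above can be sketched roughly as follows. This is a minimal illustration, not the pipeline actually used for the track: it assumes the chains are available as UCSC chain-format header lines, and it assumes that "trivial" means the same interval on the same strand aligned to itself. The names `ChainHeader`, `parse_header`, `is_trivial`, and `keep` are hypothetical, and the example header lines (including the chromosome sizes) are invented for illustration.

```python
from dataclasses import dataclass

MIN_SCORE = 20_000  # score cutoff quoted in the description above


@dataclass
class ChainHeader:
    score: int
    t_name: str
    t_start: int
    t_end: int
    q_name: str
    q_strand: str
    q_start: int
    q_end: int


def parse_header(line: str) -> ChainHeader:
    """Parse a chain header line:
    chain score tName tSize tStrand tStart tEnd qName qSize qStrand qStart qEnd id
    """
    f = line.split()
    return ChainHeader(
        score=int(f[1]),
        t_name=f[2], t_start=int(f[5]), t_end=int(f[6]),
        q_name=f[7], q_strand=f[9], q_start=int(f[10]), q_end=int(f[11]),
    )


def is_trivial(c: ChainHeader) -> bool:
    # Assumption: a chain is trivial when a region aligns to the identical
    # interval of the same chromosome on the same (+) strand.
    return (c.t_name == c.q_name and c.q_strand == "+"
            and c.t_start == c.q_start and c.t_end == c.q_end)


def keep(c: ChainHeader) -> bool:
    """Apply both filters: drop trivial chains, then drop low-scoring ones."""
    return not is_trivial(c) and c.score >= MIN_SCORE


# Invented example headers (sizes and coordinates are not real data).
headers = [
    "chain 500000 chrI 15000000 + 1000 5000 chrI 15000000 + 1000 5000 1",    # trivial
    "chain 45000 chrI 15000000 + 1000 5000 chrI 15000000 + 90000 94000 2",   # duplication
    "chain 12000 chrI 15000000 + 1000 3000 chrII 15200000 - 2000 4000 3",    # below cutoff
]
kept = [c for c in map(parse_header, headers) if keep(c)]
```

Only the second example chain survives both filters: the first maps a region onto itself and the third scores below 20,000.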
The chain track displays boxes joined by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the query assembly or an insertion in the target assembly. Double lines represent more complex gaps that involve substantial sequence in both assemblies. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one of the assemblies. In cases where there are multiple chains over a particular portion of the $O_organism genome, chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes.
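The distinction between single-lined and double-lined gaps can be illustrated with a short sketch. It assumes the chain-format convention in which each block line reads `size dt dq`, where `dt` and `dq` are the numbers of unaligned bases in the target and query before the next block, and the last block line holds only `size`. The function names and the "none" label (used when the blocks abut on the target, so no gap is visible in target coordinates) are assumptions made for this example, not part of the browser's code.

```python
from typing import List


def classify_gap(dt: int, dq: int) -> str:
    """Label the gap that follows an aligned block.

    dt: unaligned bases in the target; dq: unaligned bases in the query.
    """
    if dt < 0 or dq < 0:
        raise ValueError("gap sizes must be non-negative")
    if dt > 0 and dq > 0:
        return "double"  # substantial sequence in both assemblies
    if dt > 0:
        return "single"  # target sequence with nothing opposite in the query
    return "none"        # assumption: blocks abut on the target, nothing to draw


def gap_styles(block_lines: List[str]) -> List[str]:
    """Classify every gap in a chain body given as 'size dt dq' lines."""
    styles = []
    for line in block_lines:
        fields = line.split()
        if len(fields) == 3:  # the final block line has only a size
            _, dt, dq = (int(x) for x in fields)
            styles.append(classify_gap(dt, dq))
    return styles


# Invented chain body for illustration.
example_styles = gap_styles(["120 35 0", "80 10 250", "60 0 12", "200"])
```

For the invented body above, the three gaps are classified as single, double, and none, in that order.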
For each alignment, the display indicates the chromosome, strand, and location (in thousands of bases) of the match.
The genome was aligned to itself using blastz. Trivial alignments were filtered out. The remaining alignments were converted into axt format, and the resulting axts were fed into axtChain. This program organizes all the alignments between a target and a single query chromosome into a group and builds a kd-tree out of all the gapless subsections (blocks) of the alignments. Next, maximally scoring chains of these blocks were found by running a dynamic program over the kd-tree. Chains scoring below a threshold were discarded. The remaining chains are displayed in this track.
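The chaining step can be illustrated with a simplified dynamic program. This sketch is not axtChain: it replaces the kd-tree search with a plain quadratic scan over all earlier blocks, and `gap_cost` is a hypothetical placeholder penalty rather than the gap scoring system used for this track. The `Block` type and the example blocks are likewise invented for illustration.

```python
from typing import List, NamedTuple, Tuple


class Block(NamedTuple):
    t_start: int   # start of the gapless block on the target
    q_start: int   # start of the gapless block on the query
    size: int      # block length (same on both sequences, since it is gapless)
    score: float   # alignment score of the block


def gap_cost(dt: int, dq: int) -> float:
    # Hypothetical penalty for illustration only.
    if dt == 0 and dq == 0:
        return 0.0
    return 50.0 + 0.5 * (dt + dq)


def best_chain(blocks: List[Block]) -> Tuple[float, List[Block]]:
    """Return the maximally scoring chain of collinear, non-overlapping blocks."""
    blocks = sorted(blocks, key=lambda b: (b.t_start, b.q_start))
    n = len(blocks)
    if n == 0:
        return 0.0, []
    best = [float(b.score) for b in blocks]  # best chain score ending at each block
    prev = [-1] * n                          # back-pointers for traceback
    for j in range(n):
        for i in range(j):
            dt = blocks[j].t_start - (blocks[i].t_start + blocks[i].size)
            dq = blocks[j].q_start - (blocks[i].q_start + blocks[i].size)
            if dt < 0 or dq < 0:
                continue  # block j does not follow block i in both sequences
            cand = best[i] + blocks[j].score - gap_cost(dt, dq)
            if cand > best[j]:
                best[j], prev[j] = cand, i
    end = max(range(n), key=best.__getitem__)
    chain, k = [], end
    while k != -1:
        chain.append(blocks[k])
        k = prev[k]
    chain.reverse()
    return best[end], chain


# Invented blocks: three lie near one diagonal, one is far off it.
example_blocks = [
    Block(0, 1000, 100, 9000),
    Block(150, 1150, 100, 9000),
    Block(120, 5000, 50, 3000),
    Block(400, 1300, 200, 15000),
]
chain_score, chain = best_chain(example_blocks)
```

In this example the three near-diagonal blocks are joined into one chain and the off-diagonal block is left out. A score cutoff like the one described above would then be applied to `chain_score`.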
Blastz was developed at Pennsylvania State University by Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison.
Lineage-specific repeats were identified by Arian Smit and his program RepeatMasker. For this track, the RepeatMasker output for the $o_organism assembly was used for lineage-specific repeats of both query and target.
The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler.
The browser display and database storage of the chains were made by Rachel Harte and Jim Kent.
Human-Mouse Alignments with BLASTZ. Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison R, Haussler D, Miller W. (2003). Genome Res 13(1):103-7.
Scoring pairwise genomic sequence alignments. Chiaromonte F, Yap VB, Miller W. (2002). Pac Symp Biocomput 2002:115-26.