Motif Matcher

A program by Jim Kent for finding where a given motif occurs in sequence data. Paste up to six motifs into the small text areas and your sequence data into the large text area below. Help appears further down this page.

Options: Ignore Location; Maximum Occurrences per Sequence; Reverse Complement Too

Motif Matcher Help

A motif is a pattern that recognizes a consensus sequence that may not be completely conserved. An example of a simple motif is:

A 0.1 0.7 0.1 0.7 0.7 0.1
C 0.1 0.1 0.1 0.1 0.1 0.1
G 0.1 0.1 0.1 0.1 0.1 0.1
T 0.7 0.1 0.7 0.1 0.1 0.7

This motif recognizes the consensus sequence TATAAT. The numbers represent the probability of finding each nucleotide at the corresponding position of the motif. Motif Matcher can also take motifs in the form of counts rather than probabilities. This makes it easy to construct motifs from conserved areas of multiple alignments. For instance, you could represent the multiple alignment:

C A T G G T C A T
C A T C A T C C T
C A C G A T A A T
C A T G A T C A T

as the motif:

A 0 4 0 0 3 0 1 3 0
C 4 0 1 1 0 0 3 1 0
G 0 0 0 3 1 0 0 0 0
T 0 0 3 0 0 4 0 0 4
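
As a rough illustration of how such a count motif can be derived, the Python sketch below tallies nucleotide counts column by column for the alignment above, converts the counts to probabilities, and reads off the consensus. The function names (count_matrix, to_probabilities, consensus) are hypothetical and are not part of Motif Matcher; the sketch also assumes the alignment contains only A, C, G and T with no gap characters.

```python
from collections import Counter

NUCLEOTIDES = "ACGT"

def count_matrix(alignment):
    """Return {nucleotide: [count per column]} for equal-length aligned rows."""
    # Spaces between letters are only for display, so strip them first.
    rows = [row.replace(" ", "").upper() for row in alignment]
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    # zip(*rows) walks the alignment one column at a time.
    columns = [Counter(column) for column in zip(*rows)]
    return {nt: [column[nt] for column in columns] for nt in NUCLEOTIDES}

def to_probabilities(counts):
    """Divide each count by its column total (assumes no empty columns)."""
    width = len(counts["A"])
    totals = [sum(counts[nt][i] for nt in NUCLEOTIDES) for i in range(width)]
    return {nt: [counts[nt][i] / totals[i] for i in range(width)]
            for nt in NUCLEOTIDES}

def consensus(matrix):
    """Pick the highest-valued nucleotide in each column."""
    width = len(matrix["A"])
    return "".join(max(NUCLEOTIDES, key=lambda nt: matrix[nt][i])
                   for i in range(width))

alignment = [
    "C A T G G T C A T",
    "C A T C A T C C T",
    "C A C G A T A A T",
    "C A T G A T C A T",
]
counts = count_matrix(alignment)
for nt in NUCLEOTIDES:
    print(nt, *counts[nt])
print(consensus(counts))  # CATGATCAT
```

Because every column of a count motif sums to the number of aligned sequences, checking the column totals is a quick way to catch a mistyped count before pasting the motif in.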

Motifs may also include location information (and in fact they must if you deselect the Ignore Location check box). Generally, motifs with locations come from the Improbizer program. Here is an example of a motif with location information:

6.7333 @ 44.13 sd 8.88 TTACAG

a 0.395 0.367 0.525 0.010 0.990 0.003
c 0.098 0.183 0.074 0.526 0.003 0.003
g 0.033 0.059 0.089 0.003 0.003 0.990
t 0.474 0.391 0.313 0.461 0.003 0.003

The location line is simply added above the nucleotide probability lines. The format of the line is:

score @ mean sd standardDeviation consensusSeq

Where: "score" is how strong Improbizer thought the motif was in the original data (score can be omitted); "@" must be present and helps distinguish location lines; "mean" is the average position of the left-hand end of the motif in the sequence set; "sd" must be present and helps distinguish location lines; "standardDeviation" is the standard deviation of the motif position (in nucleotides); "consensusSeq" shows the most common nucleotide at each position (and can be omitted).
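
To make the format concrete, here is a minimal Python sketch of a parser for a location line. It is not Motif Matcher's own parser: the names LocationLine and parse_location_line are hypothetical, and the exact whitespace and number formats the program accepts are assumptions based only on the description above.

```python
import re
from typing import NamedTuple, Optional

class LocationLine(NamedTuple):
    score: Optional[float]      # may be omitted in the input
    mean: float                 # average position of the motif's left-hand end
    sd: float                   # standard deviation of position, in nucleotides
    consensus: Optional[str]    # may be omitted in the input

# Assumed number syntax: optional sign, digits, optional decimal part.
_NUM = r"[-+]?(?:\d+\.?\d*|\.\d+)"
_LOCATION_RE = re.compile(
    rf"^\s*(?:(?P<score>{_NUM})\s+)?@\s*(?P<mean>{_NUM})"
    rf"\s+sd\s+(?P<sd>{_NUM})(?:\s+(?P<consensus>[A-Za-z]+))?\s*$"
)

def parse_location_line(line: str) -> Optional[LocationLine]:
    """Return the parsed fields, or None if the line is not a location line."""
    match = _LOCATION_RE.match(line)
    if match is None:
        return None
    score = match.group("score")
    return LocationLine(
        score=float(score) if score is not None else None,
        mean=float(match.group("mean")),
        sd=float(match.group("sd")),
        consensus=match.group("consensus"),
    )

print(parse_location_line("6.7333 @ 44.13 sd 8.88 TTACAG"))
print(parse_location_line("@ 44.13 sd 8.88"))
print(parse_location_line("a 0.395 0.367 0.525 0.010 0.990 0.003"))  # None
```

The required "@" and "sd" tokens are what let the sketch tell a location line apart from a nucleotide probability line, which is the role the description gives them.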