Patterns
Patterns can be either oligonucleotides (e.g. CACGTG,
CACGTT) or dyads (e.g. CGGn{11}CCG).
Patterns can be entered either in the text area, or by specifying a
file on your machine (upload). Each pattern must appear as the first
word of a line. Lines starting with a semicolon (;) are ignored.
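
The input rules above can be sketched in Python as follows. This is only an illustration, not the code used by pattern-assembly: the function name read_patterns is hypothetical, and skipping empty lines is an assumption that the text above does not state.

```python
def read_patterns(text):
    """Return the first word of every line that holds a pattern.

    Lines starting with a semicolon are ignored, as described above.
    Skipping empty lines is an assumption made for this sketch.
    """
    patterns = []
    for line in text.splitlines():
        if line.startswith(";"):
            continue  # comment line
        words = line.split()
        if not words:
            continue  # empty or whitespace-only line (assumption)
        patterns.append(words[0])  # only the first word is the pattern
    return patterns


example = "; patterns with scores\nCACGTG 5.2\nCGGn{11}CCG\t3.1\n\nCACGTT"
print(read_patterns(example))
```

Any other column on a line (for example a score) is left untouched by this sketch.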
Strands
Strand-sensitive or strand-insensitive assembly.
With the strand-insensitive option, patterns can be used in either
direct or reverse complement orientation for assembly. For each
pattern, the orientation which offers the best match is chosen.
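
As an illustration of the strand-insensitive option, the sketch below computes the reverse complement of an oligonucleotide and keeps the orientation with the fewest mismatches against a reference. The function names are hypothetical, the mismatch count over two strings of equal length is an assumed stand-in for the "best match" criterion, and dyads such as CGGn{11}CCG are not handled.

```python
# Complement table for plain oligonucleotides (no dyad spacers).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(oligo):
    """Return the reverse complement of a plain oligonucleotide."""
    return oligo.translate(_COMPLEMENT)[::-1]


def mismatches(a, b):
    """Count positions that differ between two strings of equal length."""
    return sum(1 for x, y in zip(a, b) if x != y)


def best_orientation(pattern, reference):
    """Return the orientation of pattern closest to reference.

    Assumption: both strings have the same length and are compared
    without any shift; ties keep the direct orientation.
    """
    reverse = reverse_complement(pattern)
    if mismatches(reverse, reference) < mismatches(pattern, reference):
        return reverse
    return pattern


print(reverse_complement("CACGTT"))
print(best_orientation("AACGTG", "CACGTT"))
```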
Score column
Pattern assembly is an NP-hard problem, i.e. the computation time
increases exponentially with the number of patterns. Beyond a certain
number of patterns, it is impossible to evaluate all possible
assemblies in order to select the best ones. pattern-assembly
therefore implements a heuristic, which is sensitive to the order of
entry of the patterns. When a score column is specified, patterns are
incorporated according to their scores (patterns with higher scores
are incorporated first).
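
The effect of the score column on the order of incorporation can be sketched as below. This is a hedged illustration only: the function name order_by_score is hypothetical, columns are assumed to be separated by whitespace and numbered from 1, and patterns with equal scores are assumed to keep their order of entry.

```python
def order_by_score(lines, score_column):
    """Return patterns sorted by decreasing score.

    score_column is 1-based (assumption); the pattern is the first
    word of each line, and lines starting with ';' are ignored.
    """
    scored = []
    for line in lines:
        if line.startswith(";") or not line.strip():
            continue
        fields = line.split()
        scored.append((float(fields[score_column - 1]), fields[0]))
    # Python's sort is stable, so equal scores keep their entry order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [pattern for _, pattern in scored]


lines = ["CACGTG 3.5", "CACGTT 7.1", "; comment", "ACGTGA 5.0"]
print(order_by_score(lines, score_column=2))
```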
Maximum flanking residues
The flanking segment is the portion of a fragment that extends outside
of the assembly on which it is aligned.
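
The definition above can be made concrete with a small sketch. It assumes a fragment is placed on the assembly at a given offset, where offset 0 aligns the first residue of the fragment with the first residue of the assembly and negative offsets start before it. The function name and this offset convention are hypothetical.

```python
def flanking_lengths(assembly_length, fragment_length, offset):
    """Return (left, right) numbers of fragment residues lying
    outside of the assembly, for a fragment placed at offset."""
    left = max(0, -offset)
    right = max(0, offset + fragment_length - assembly_length)
    return left, right


# A 6-residue fragment starting 2 positions before an 8-residue assembly
print(flanking_lengths(assembly_length=8, fragment_length=6, offset=-2))
# The same fragment starting at position 4 of the assembly
print(flanking_lengths(assembly_length=8, fragment_length=6, offset=4))
```

Under this assumption, a placement would be rejected whenever either returned value exceeds the maximum number of flanking residues.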
Maximum substitutions
Maximum number of substitutions allowed when incorporating a pattern
into a cluster.
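
As a rough illustration, substitutions can be counted as the mismatching positions in the region where a pattern overlaps the assembly. The function below is a sketch under that assumption; its name and the offset convention (0 aligns the first residues, negative values start before the assembly) are hypothetical.

```python
def count_substitutions(assembly, pattern, offset):
    """Count mismatches between pattern and assembly over the
    positions where both are defined, for a given offset."""
    count = 0
    for i, residue in enumerate(pattern):
        position = offset + i
        if 0 <= position < len(assembly):
            if assembly[position] != residue:
                count += 1
    return count


# ACGTTA shifted by 1 on CACGTG: one mismatch in the overlap
print(count_substitutions("CACGTG", "ACGTTA", offset=1))
```

In this sketch, a pattern would be incorporated only if the count does not exceed the maximum number of substitutions.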
Maximum number of patterns
Assembly takes a very long time when too many patterns are submitted.
In any case, when a motif discovery program returns too many patterns,
it generally reflects a problem (redundant sequences, wrong selection
of the threshold). The option can nevertheless be set to any value,
but the computation time increases quadratically.